The integration of artificial intelligence (AI) into healthcare has emerged as a transformative force revolutionizing diagnostics and treatment. The urgency of the COVID-19 pandemic has underscored the critical need for rapid and accurate diagnostic tools. One such innovation that holds immense promise is the development of AI prediction models for classifying medical images in respiratory health.
A subset of an open COVID-19 Radiography Dataset from Kaggle (with all credits attributed to Preet Viradiya, Juliana Negrini De Araujo, Tawsifur Rahman, Muhammad Chowdhury and Amith Khandakar) was used for the analysis as consolidated from the following primary sources:
This study hypothesized that images contain a hierarchy of features which allows the differentiation and classification across various image categories.
Subsequent analysis and modelling steps involving data understanding, data preparation, data exploration, model development, model validation and model presentation are individually detailed below, with all the results consolidated in a Summary provided at the end of the document.
The main objective of the study is to develop multiple convolutional neural network classification models that could automatically learn hierarchical features directly from raw pixel data of x-ray images (categorized as Normal, Viral Pneumonia, and COVID-19), while delivering accurate predictions when applied to new unseen data.
Specific objectives are given as follows:
Obtain an optimal subset of observations by conducting data quality assessment and applying preprocessing operations to improve generalization and reduce sensitivity to variations most suitable for the downstream analysis
Develop multiple convolutional neural network models with remedial measures applied to prevent overfitting and improve the stability of the training process
Select the final classification model among candidates based on robust performance estimates
Evaluate the final model performance and generalization ability through external validation in an independent set
Conduct a post-hoc exploration of the model results to provide general insights on the importance, contribution and effect of the various hierarchical features to model prediction
The analysis endpoint for the study is described below:
The hierarchical representation of image features enables the network to transform raw pixel data into a meaningful and compact representation, allowing it to make accurate predictions during image classification. The different features automatically learned during the training process are as follows:
Preliminary images used in the study were evaluated and prepared for analysis and modelling using the following methods:
Data Quality Assessment involves profiling and assessing the data to understand its suitability for machine learning tasks. The quality of training data has a huge impact on the efficiency, accuracy and complexity of machine learning tasks. Data remains susceptible to errors or irregularities that may be introduced during the collection, aggregation or annotation stages. Issues such as incorrect labels or synonymous categories in a categorical variable, among others, might go undetected by standard preprocessing modules and can lead to sub-optimal model performance, inaccurate analysis and unreliable decisions.
Data Preprocessing involves transforming the raw input images into a representation that is more suitable for the downstream modelling and estimation processes, including image size standardization, pixel scaling, pixel normalization and image augmentation. Resizing images to a consistent size ensures compatibility with the network architecture, enabling efficient batch processing and avoiding issues related to varying input dimensions. Normalizing pixel values to a common scale helps in achieving numerical stability during training by allowing all input features to contribute equally to the learning process, preventing certain features from dominating due to different scales. Augmentation methods, which apply random transformations to create new training samples, artificially increase the size of the dataset and help the model generalize better to unseen data while reducing the risk of overfitting.
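The resizing, scaling and augmentation steps described above can be sketched on a toy array. This is a minimal illustration only, assuming a synthetic 8x8 grayscale image in place of a real x-ray. The helper names `preprocess` and `random_flip` are hypothetical and are not part of the notebook, which relies on Keras generators for these steps.

```python
import numpy as np

def preprocess(image, target_size=(4, 4)):
    """Resize with nearest-neighbour sampling, then scale pixels to [0, 1]."""
    # Pick evenly spaced source rows and columns (hypothetical, simplistic resize)
    rows = np.arange(target_size[0]) * image.shape[0] // target_size[0]
    cols = np.arange(target_size[1]) * image.shape[1] // target_size[1]
    resized = image[np.ix_(rows, cols)]
    # Map 8-bit intensities from [0, 255] to [0, 1]
    return resized.astype(np.float32) / 255.0

def random_flip(image, rng):
    """Flip the image left to right with probability 0.5 (one simple augmentation)."""
    return image[:, ::-1] if rng.random() < 0.5 else image

rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(8, 8), dtype=np.uint8)  # synthetic stand-in image
scaled = preprocess(raw)
augmented = random_flip(scaled, rng)
print(scaled.shape, float(scaled.min()), float(scaled.max()))
```

Because the flip is applied at random each time a sample is drawn, the same source image can yield different training inputs across epochs.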
Data Exploration involves analyzing and investigating image data sets to summarize their main characteristics, often employing class-level summary statistics and data visualization methods for the image pixel intensity values. This process aids in providing insights into the diversity, structure, and characteristics of the image data, helping guide preprocessing decisions and providing a better understanding of the challenges and opportunities in modeling with CNNs.
Convolutional neural network (CNN) models automatically learn hierarchical representations of features from raw pixel values without relying on handcrafted features with details described as follows:
Hierarchical Feature Extraction enables CNN models to capture increasingly abstract and discriminative information as input data are processed through multiple layers. This hierarchical process involves capturing low-level features in the initial layers and gradually constructing more complex and abstract features in deeper layers. The initial layers of a CNN primarily focus on capturing lower-level features, such as edges and corners. These features are extracted using small filters (kernels) that slide over the input image. Convolutional layers, followed by activation functions like rectified linear units (ReLU), help identify and enhance these basic patterns. Pooling layers, inserted after convolutional layers, reduce the spatial dimensions of the feature maps, emphasizing the most important information and discarding less relevant details. This pooling operation enables the model to focus on more abstract spatial hierarchies. As the network progresses through deeper layers, it starts to capture mid-level to higher-level features representing more complex structures, such as textures and patterns.
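The convolution, activation and pooling sequence described above can be traced by hand on a small example. The sketch below is an illustration in plain NumPy rather than the Keras layers used for the models. The toy image, the hand-written vertical-edge kernel and the helper names are all assumptions made for demonstration, whereas a real CNN learns its kernels during training.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the kernel and the patch under it
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Keep positive responses and zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """Downsample by taking the maximum of each non-overlapping 2x2 block."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    return x[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy image: dark left half, bright right half (a single vertical edge)
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-written kernel that responds to dark-to-bright transitions
vertical_edge = np.array([[-1.0, 0.0, 1.0]] * 3)

feature_map = relu(conv2d_valid(image, vertical_edge))
pooled = max_pool2x2(feature_map)
print(feature_map)
print(pooled)
```

The feature map is non-zero only in the columns where the kernel straddles the edge, and pooling keeps that response while halving each spatial dimension.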
Machine Learning Classification Models are algorithms that learn to assign predefined categories or labels to input data based on patterns and relationships identified during the training phase. Classification is a supervised learning task, meaning the models are trained on a labeled dataset where the correct output (class or label) is known for each input. Once trained, these models can predict the class of new, unseen instances.
This study implemented both glass-box and black-box classification modelling procedures with simple to complex structures involving moderate to large numbers of model coefficients or mathematical transformations which lacked transparency in terms of the internal processes and weighted factors used in reaching a decision. Models applied in the analysis for predicting the categorical target were the following:
Convolutional Neural Network Models are a neural network architecture specifically designed for image classification and computer vision tasks by automatically learning hierarchical features directly from raw pixel data. The core building block of a CNN is the convolutional layer. Convolution operations apply learnable filters (kernels) to input images to detect patterns such as edges, textures, and more complex structures. The layers systematically learn hierarchical features from low-level (e.g., edges) to high-level (e.g., object parts) as the network deepens. Filters are shared across the entire input space, enabling the model to recognize patterns regardless of their spatial location. After convolutional operations, an activation function is applied element-wise to introduce non-linearity and allow the model to learn complex relationships between features. Pooling layers downsample the spatial dimensions of the feature maps, which reduces the computational load and the number of parameters in the subsequent layers while contributing to a spatial hierarchy and a degree of translation invariance. Fully connected layers process the flattened features to make predictions and produce an output vector that corresponds to class probabilities using an activation function. The CNN is trained using backpropagation and an optimization algorithm. A loss function measures the difference between predicted and actual labels. Gradients of the loss with respect to the weights are computed through backpropagation, and the weights are updated accordingly to minimize the loss.
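The final part of this description, an output layer producing class probabilities and weights updated to reduce a loss, can be illustrated with a single dense layer. The sketch below is a simplified stand-in and not the notebook's training procedure. It assumes a softmax output with categorical cross-entropy, random synthetic feature vectors, an arbitrary learning rate of 0.1 and plain gradient descent, none of which are taken from the study's settings.

```python
import numpy as np

def softmax(logits):
    """Convert each row of scores into probabilities that sum to 1."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, one_hot):
    """Mean categorical cross-entropy between predictions and one-hot labels."""
    return float(-np.mean(np.sum(one_hot * np.log(probs + 1e-12), axis=1)))

rng = np.random.default_rng(0)
features = rng.normal(size=(6, 4))        # six synthetic flattened feature vectors
labels = np.eye(3)[[0, 1, 2, 0, 1, 2]]    # one-hot labels for three classes
weights = np.zeros((4, 3))
bias = np.zeros(3)
learning_rate = 0.1                       # assumed value for illustration

loss_before = cross_entropy(softmax(features @ weights + bias), labels)
for _ in range(100):
    probs = softmax(features @ weights + bias)
    # Gradient of the mean cross-entropy with respect to the logits
    grad_logits = (probs - labels) / len(features)
    # Backpropagate to the layer parameters and take a descent step
    weights -= learning_rate * features.T @ grad_logits
    bias -= learning_rate * grad_logits.sum(axis=0)
loss_after = cross_entropy(softmax(features @ weights + bias), labels)
print(loss_before, loss_after)
```

With all-zero initial weights every class receives probability 1/3, so the starting loss equals ln(3); the updates then lower it.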
All hyperparameter settings used during the model development process were fixed based on heuristics, given that training deep CNNs is computationally expensive. Performing an exhaustive search over the high dimensional hyperparameter space (including the number of layers, layer types, filter sizes, strides, learning rates and batch sizes, among others) becomes impractical due to the time and resources required for each training iteration. Internal model evaluation involved the following approach:
Split-Sample Holdout Validation involves dividing the data set, after a random shuffle, into training and testing sets. A random split is appropriate here given the lack of inherent structure or temporal ordering in the data.
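A shuffled holdout split can be sketched in a few lines. This is a minimal illustration using hypothetical image identifiers and a hypothetical `holdout_split` helper. It is not necessarily how the Keras generators later in the notebook partition the files, although the 80/20 proportion matches the `validation_split=0.2` setting used there.

```python
import random

def holdout_split(items, test_fraction=0.2, seed=123):
    """Shuffle a copy of the items, then cut it into training and testing parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical identifiers standing in for the 3600 images
image_ids = [f"IMG-{i}" for i in range(3600)]
train_ids, test_ids = holdout_split(image_ids)
print(len(train_ids), len(test_ids))
```

Each identifier lands in exactly one of the two sets, so no image is used both for fitting and for evaluation.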
The predictive performance of the formulated classification models in the study was compared and evaluated using the following metrics:
Precision is the ratio of correctly predicted positive observations to the total predicted positives. It is useful when the cost of false positives is high but does not consider false negatives, so might not be suitable for imbalanced datasets.
Recall is the ratio of correctly predicted positive observations to all the actual positives. It is useful when the cost of false negatives is high but does not consider false positives, so might not be suitable for imbalanced datasets.
F1 Score is the harmonic mean of precision and recall. It balances precision and recall, providing a single metric for performance evaluation which is suitable for imbalanced datasets. However, it might not be the best metric in situations where either precision or recall is more critical.
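The three metrics above follow directly from the counts in a confusion matrix. The sketch below computes them for one class in a one-vs-rest fashion. The confusion matrix values are made up for illustration and are not results from this study, and `per_class_metrics` is a hypothetical helper.

```python
def per_class_metrics(confusion, class_index):
    """Return (precision, recall, f1) for one class.

    confusion[i][j] counts observations of actual class i predicted as class j.
    """
    tp = confusion[class_index][class_index]
    fp = sum(row[class_index] for row in confusion) - tp  # predicted as class, but wrong
    fn = sum(confusion[class_index]) - tp                  # actual class, but missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1

# Made-up counts: rows are actual classes, columns are predicted classes
confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
precision, recall, f1 = per_class_metrics(confusion, 0)
print(precision, recall, f1)
```

For the first class there are 8 true positives, 2 false positives and 2 false negatives, so precision, recall and F1 all equal 0.8.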
Model presentation was conducted post-hoc and focused on both model-specific techniques and model-agnostic techniques, the latter of which make no assumptions about the model structure. These methods are described as follows:
Convolutional Layer Filter Visualization helps in understanding what specific patterns or features the CNN has learned during the training process. Given that convolutional layers learn filters that act as feature extractors, visualizing these filters can provide insights into the types of patterns or textures the network is sensitive to. In addition, image representations of filters allow the assessment of how the complexity of features evolves through the network, with low-level features such as edges or textures captured in the earlier layers and more abstract and complex features detected by filters in deeper layers. By applying learned filters to an input image, it is possible to visualize which regions of the image activate specific filters the most. This can aid in identifying which parts of the input contribute most to the response of a particular filter, providing insights into what the network focuses on.
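Before filters can be displayed as images, their weights are usually rescaled to a displayable range. The sketch below shows one common way to do this, assuming the Keras `Conv2D` kernel layout of (height, width, input channels, number of filters). The random kernel values stand in for learned weights, and `normalize_filters` is a hypothetical helper.

```python
import numpy as np

def normalize_filters(kernels):
    """Min-max scale each filter to [0, 1] so it can be shown as an image.

    kernels: array of shape (height, width, in_channels, n_filters).
    """
    lo = kernels.min(axis=(0, 1, 2), keepdims=True)
    hi = kernels.max(axis=(0, 1, 2), keepdims=True)
    # Guard against constant filters to avoid division by zero
    span = np.where(hi > lo, hi - lo, 1.0)
    return (kernels - lo) / span

rng = np.random.default_rng(0)
kernels = rng.normal(size=(3, 3, 1, 8))   # stand-in for eight learned 3x3 filters
display_ready = normalize_filters(kernels)
print(display_ready.shape)
```

Each slice `display_ready[:, :, 0, k]` can then be passed to an image plotting function to inspect filter `k`.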
Gradient-Weighted Class Activation Maps highlight the regions of an input image that contribute the most to a specific class prediction from a CNN model by providing a heatmap that indicates the importance of different regions in the input image for a particular classification decision. Grad-CAM helps identify which regions of the input image are crucial for a CNN's decision on a specific class. It provides a localization map that highlights the relevant parts of the image that contribute to the predicted class. By overlaying the Grad-CAM heatmap on the original image, one can visually understand where the model is focusing its attention when making predictions. This spatial understanding is particularly valuable for tasks such as object detection or segmentation.
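The core Grad-CAM computation is a gradient-weighted sum of feature maps followed by a ReLU. The sketch below assumes that the activations of the last convolutional layer and the gradients of the class score with respect to them have already been extracted (in practice through a framework's automatic differentiation). The arrays hold made-up toy values and `grad_cam` is a hypothetical helper.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Combine feature maps into a class heatmap.

    activations, gradients: arrays of shape (height, width, channels) holding the
    last convolutional layer's output and the gradient of the class score with
    respect to that output.
    """
    # Global-average-pool the gradients to get one importance weight per channel
    channel_weights = gradients.mean(axis=(0, 1))
    # Weighted sum of the feature maps, keeping only positive evidence (ReLU)
    heatmap = np.maximum(activations @ channel_weights, 0.0)
    # Scale to [0, 1] for overlaying on the input image
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap

# Toy inputs: channel 0 fires top-left, channel 1 fires bottom-right
activations = np.zeros((2, 2, 2))
activations[0, 0, 0] = 1.0
activations[1, 1, 1] = 1.0
# Channel 0 supports the class (positive gradient), channel 1 opposes it
gradients = np.zeros((2, 2, 2))
gradients[:, :, 0] = 2.0
gradients[:, :, 1] = -1.0

heatmap = grad_cam(activations, gradients)
print(heatmap)
```

Only the top-left location is highlighted, because the channel active at the bottom-right carries a negative weight and is removed by the ReLU.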
##################################
# Installing important packages
##################################
# !pip install mlxtend
# !pip install --upgrade tensorflow
# !pip install opencv-python
# !pip install keras==2.12.0
##################################
# Loading Python Libraries
# for Data Loading,
# Data Preprocessing and
# Exploratory Data Analysis
##################################
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import matplotlib.cm as cm
from matplotlib.offsetbox import OffsetImage, AnnotationBbox
%matplotlib inline
import tensorflow as tf
import keras
from PIL import Image
from glob import glob
import cv2
import os
import random
##################################
# Loading Python Libraries
# for Model Development
##################################
from keras import backend as K
from keras import regularizers
from keras.models import Sequential, Model,load_model
from keras.layers import Activation, Dense, Dropout, Flatten, Conv2D, MaxPooling2D, MaxPool2D, AveragePooling2D, GlobalMaxPooling2D, BatchNormalization
from keras.wrappers.scikit_learn import KerasClassifier
from keras.utils.np_utils import to_categorical
from keras.optimizers import Adam, SGD
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint
from tensorflow.keras.utils import img_to_array, array_to_img, load_img
##################################
# Loading Python Libraries
# for Model Evaluation
##################################
from keras.metrics import PrecisionAtRecall, Recall
from sklearn.metrics import confusion_matrix
from sklearn.metrics import precision_recall_fscore_support, accuracy_score
##################################
# Setting random seed options
# for the analysis
##################################
def set_seed(seed=88888888):
    np.random.seed(seed)
    tf.random.set_seed(seed)
    keras.utils.set_random_seed(seed)
    random.seed(seed)
    tf.config.experimental.enable_op_determinism()
    os.environ['TF_DETERMINISTIC_OPS'] = "1"
    os.environ['TF_CUDNN_DETERMINISM'] = "1"
    os.environ['PYTHONHASHSEED'] = str(seed)
set_seed()
##################################
# Loading the dataset
##################################
path = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/'
##################################
# Defining the image category levels
##################################
diagnosis_code_dictionary = {'COVID': 0,
                             'Normal': 1,
                             'Viral Pneumonia': 2}
##################################
# Defining the image category descriptions
##################################
diagnosis_description_dictionary = {'COVID': 'Covid-19',
                                    'Normal': 'Healthy',
                                    'Viral Pneumonia': 'Viral Pneumonia'}
##################################
# Consolidating the image path
##################################
imageid_path_dictionary = {os.path.splitext(os.path.basename(x))[0]: x for x in glob(os.path.join(path, '*','*.png'))}
##################################
# Taking a snapshot of the dictionary
##################################
dict(list(imageid_path_dictionary.items())[0:5])
{'COVID-1': 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset\\COVID\\COVID-1.png',
'COVID-10': 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset\\COVID\\COVID-10.png',
'COVID-100': 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset\\COVID\\COVID-100.png',
'COVID-1000': 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset\\COVID\\COVID-1000.png',
'COVID-1001': 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset\\COVID\\COVID-1001.png'}
##################################
# Consolidating the information
# from the dataset
# into a dataframe
##################################
xray_images = pd.DataFrame.from_dict(imageid_path_dictionary, orient = 'index').reset_index()
xray_images.columns = ['Image_ID','Path']
classes = xray_images.Image_ID.str.split('-').str[0]
xray_images['Diagnosis'] = classes
xray_images['Target'] = xray_images['Diagnosis'].map(diagnosis_code_dictionary.get)
xray_images['Class'] = xray_images['Diagnosis'].map(diagnosis_description_dictionary.get)
##################################
# Performing a general exploration of the dataset
##################################
print('Dataset Dimensions: ')
display(xray_images.shape)
Dataset Dimensions:
(3600, 5)
##################################
# Listing the column names and data types
##################################
print('Column Names and Data Types:')
display(xray_images.dtypes)
Column Names and Data Types:
Image_ID     object
Path         object
Diagnosis    object
Target        int64
Class        object
dtype: object
##################################
# Taking a snapshot of the dataset
##################################
xray_images.head()
| | Image_ID | Path | Diagnosis | Target | Class |
|---|---|---|---|---|---|
| 0 | COVID-1 | C:/Users/John pauline magno/Python Notebooks/C... | COVID | 0 | Covid-19 |
| 1 | COVID-10 | C:/Users/John pauline magno/Python Notebooks/C... | COVID | 0 | Covid-19 |
| 2 | COVID-100 | C:/Users/John pauline magno/Python Notebooks/C... | COVID | 0 | Covid-19 |
| 3 | COVID-1000 | C:/Users/John pauline magno/Python Notebooks/C... | COVID | 0 | Covid-19 |
| 4 | COVID-1001 | C:/Users/John pauline magno/Python Notebooks/C... | COVID | 0 | Covid-19 |
##################################
# Performing a general exploration of the numeric variables
##################################
print('Numeric Variable Summary:')
display(xray_images.describe(include='number').transpose())
Numeric Variable Summary:
| | count | mean | std | min | 25% | 50% | 75% | max |
|---|---|---|---|---|---|---|---|---|
| Target | 3600.0 | 1.0 | 0.81661 | 0.0 | 0.0 | 1.0 | 2.0 | 2.0 |
##################################
# Performing a general exploration of the object variable
##################################
print('Object Variable Summary:')
display(xray_images.describe(include='object').transpose())
Object Variable Summary:
| | count | unique | top | freq |
|---|---|---|---|---|
| Image_ID | 3600 | 3600 | COVID-1 | 1 |
| Path | 3600 | 3600 | C:/Users/John pauline magno/Python Notebooks/C... | 1 |
| Diagnosis | 3600 | 3 | COVID | 1200 |
| Class | 3600 | 3 | Covid-19 | 1200 |
##################################
# Performing a general exploration of the target variable
##################################
xray_images.Diagnosis.value_counts()
COVID              1200
Normal             1200
Viral Pneumonia    1200
Name: Diagnosis, dtype: int64
##################################
# Performing a general exploration of the target variable
##################################
xray_images.Diagnosis.value_counts(normalize=True)
COVID              0.333333
Normal             0.333333
Viral Pneumonia    0.333333
Name: Diagnosis, dtype: float64
Data quality findings based on assessment are as follows:
##################################
# Counting the number of duplicated image records
##################################
xray_images.duplicated().sum()
0
##################################
# Gathering the number of missing values per column
##################################
xray_images.isnull().sum()
Image_ID     0
Path         0
Diagnosis    0
Target       0
Class        0
dtype: int64
##################################
# Including the pixel information
# of the actual images
# in array format
# into a dataframe
##################################
xray_images['Image'] = xray_images['Path'].map(lambda x: np.asarray(Image.open(x).resize((75,75))))
##################################
# Listing the column names and data types
##################################
print('Column Names and Data Types:')
display(xray_images.dtypes)
Column Names and Data Types:
Image_ID     object
Path         object
Diagnosis    object
Target        int64
Class        object
Image        object
dtype: object
##################################
# Taking a snapshot of the dataset
##################################
xray_images.head()
| | Image_ID | Path | Diagnosis | Target | Class | Image |
|---|---|---|---|---|---|---|
| 0 | COVID-1 | C:/Users/John pauline magno/Python Notebooks/C... | COVID | 0 | Covid-19 | [[15, 0, 0, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0... |
| 1 | COVID-10 | C:/Users/John pauline magno/Python Notebooks/C... | COVID | 0 | Covid-19 | [[129, 125, 123, 121, 119, 117, 114, 104, 104,... |
| 2 | COVID-100 | C:/Users/John pauline magno/Python Notebooks/C... | COVID | 0 | Covid-19 | [[11, 0, 0, 3, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0... |
| 3 | COVID-1000 | C:/Users/John pauline magno/Python Notebooks/C... | COVID | 0 | Covid-19 | [[42, 39, 38, 42, 38, 35, 31, 26, 24, 24, 24, ... |
| 4 | COVID-1001 | C:/Users/John pauline magno/Python Notebooks/C... | COVID | 0 | Covid-19 | [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 2, 1, 0,... |
##################################
# Plotting sample images
# for each diagnosis category
##################################
n_samples = 5
fig, m_axs = plt.subplots(3, n_samples, figsize = (3*n_samples, 8))
for n_axs, (type_name, type_rows) in zip(m_axs, xray_images.sort_values(['Diagnosis']).groupby('Diagnosis')):
    n_axs[2].set_title(type_name, fontsize = 14, weight = 'bold')
    for c_ax, (_, c_row) in zip(n_axs, type_rows.sample(n_samples, random_state=123).iterrows()):
        picture = c_row['Path']
        image = cv2.imread(picture)
        c_ax.imshow(image)
        c_ax.axis('off')
##################################
# Sampling a single image
##################################
samples, features = xray_images.shape
pic_id = random.randrange(0, samples)
picture = xray_images['Path'][pic_id]
image = cv2.imread(picture)
##################################
# Plotting using subplots
##################################
plt.figure(figsize=(15, 5))
##################################
# Formulating the original image
##################################
plt.subplot(1, 4, 1)
plt.imshow(image)
plt.title('Original Image', fontsize = 14, weight = 'bold')
plt.axis('off')
##################################
# Formulating the blue channel
##################################
plt.subplot(1, 4, 2)
plt.imshow(image[ : , : , 0])
plt.title('Blue Channel', fontsize = 14, weight = 'bold')
plt.axis('off')
##################################
# Formulating the green channel
##################################
plt.subplot(1, 4, 3)
plt.imshow(image[ : , : , 1])
plt.title('Green Channel', fontsize = 14, weight = 'bold')
plt.axis('off')
##################################
# Formulating the red channel
##################################
plt.subplot(1, 4, 4)
plt.imshow(image[ : , : , 2])
plt.title('Red Channel', fontsize = 14, weight = 'bold')
plt.axis('off')
##################################
# Consolidating all images
##################################
plt.show()
##################################
# Determining the image shape
##################################
print('Image Shape:')
display(image.shape)
Image Shape:
(299, 299, 3)
##################################
# Determining the image height
##################################
print('Image Height:')
display(image.shape[0])
Image Height:
299
##################################
# Determining the image width
##################################
print('Image Width:')
display(image.shape[1])
Image Width:
299
##################################
# Determining the image dimension
##################################
print('Image Dimension:')
display(image.ndim)
Image Dimension:
3
##################################
# Determining the image size
##################################
print('Image Size:')
display(image.size)
Image Size:
268203
##################################
# Determining the image data type
##################################
print('Image Data Type:')
display(image.dtype)
Image Data Type:
dtype('uint8')
##################################
# Determining the maximum RGB value
##################################
print('Image Maximum RGB:')
display(image.max())
Image Maximum RGB:
205
##################################
# Determining the minimum RGB value
##################################
print('Image Minimum RGB:')
display(image.min())
Image Minimum RGB:
10
##################################
# Identifying the path for the images
# and defining image categories
##################################
path = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset'
classes=["COVID", "Normal", "Viral Pneumonia"]
num_classes = len(classes)
batch_size = 16
##################################
# Creating subsets of images
# for model training and
# setting the parameters for
# real-time data augmentation
# at each epoch
##################################
set_seed()
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=20,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   horizontal_flip=True,
                                   vertical_flip=True,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   validation_split=0.2)
##################################
# Loading the model training images
##################################
train_gen = train_datagen.flow_from_directory(directory=path,
                                              target_size=(299, 299),
                                              class_mode='categorical',
                                              subset='training',
                                              shuffle=True,
                                              classes=classes,
                                              batch_size=batch_size,
                                              color_mode="grayscale")
Found 2880 images belonging to 3 classes.
##################################
# Loading samples of augmented images
# for the training set
##################################
fig, axes = plt.subplots(1, 5, figsize=(15, 3))
for i in range(5):
    batch = next(train_gen)
    images, labels = batch
    axes[i].imshow(images[0])
    axes[i].set_title(f"Label: {labels[0]}")
    axes[i].axis('off')
plt.show()
##################################
# Creating subsets of images
# for model validation
# with pixel rescaling only
# (no data augmentation)
##################################
set_seed()
test_datagen = ImageDataGenerator(rescale=1./255,
                                  validation_split=0.2)
##################################
# Loading the model evaluation images
##################################
test_gen = test_datagen.flow_from_directory(directory=path,
                                            target_size=(299, 299),
                                            class_mode='categorical',
                                            subset='validation',
                                            shuffle=False,
                                            classes=classes,
                                            batch_size=batch_size,
                                            color_mode="grayscale")
Found 720 images belonging to 3 classes.
##################################
# Loading samples of images
# from the validation set
# (rescaled only, not augmented)
##################################
fig, axes = plt.subplots(1, 5, figsize=(15, 3))
for i in range(5):
    batch = next(test_gen)
    images, labels = batch
    axes[i].imshow(images[0])
    axes[i].set_title(f"Label: {labels[0]}")
    axes[i].axis('off')
plt.show()
##################################
# Consolidating summary statistics
# for the image pixel values
##################################
mean_val = []
std_dev_val = []
max_val = []
min_val = []
for i in range(0, samples):
    mean_val.append(xray_images['Image'][i].mean())
    std_dev_val.append(np.std(xray_images['Image'][i]))
    max_val.append(xray_images['Image'][i].max())
    min_val.append(xray_images['Image'][i].min())
imageEDA = xray_images.loc[:,['Image', 'Class','Path']]
imageEDA['Mean'] = mean_val
imageEDA['StDev'] = std_dev_val
imageEDA['Max'] = max_val
imageEDA['Min'] = min_val
##################################
# Consolidating the overall mean
# for the pixel intensity means
# grouped by categories
##################################
imageEDA.groupby(['Class'])['Mean'].mean()
Class
Covid-19           143.712634
Healthy            122.619439
Viral Pneumonia    125.310461
Name: Mean, dtype: float64
##################################
# Consolidating the overall minimum
# for the pixel intensity means
# grouped by categories
##################################
imageEDA.groupby(['Class'])['Mean'].min()
Class
Covid-19           46.677511
Healthy            73.304356
Viral Pneumonia    64.771022
Name: Mean, dtype: float64
##################################
# Consolidating the overall maximum
# for the pixel intensity means
# grouped by categories
##################################
imageEDA.groupby(['Class'])['Mean'].max()
Class
Covid-19           216.570667
Healthy            175.906667
Viral Pneumonia    179.011911
Name: Mean, dtype: float64
##################################
# Consolidating the overall standard deviation
# for the pixel intensity means
# grouped by categories
##################################
imageEDA.groupby(['Class'])['Mean'].std()
Class
Covid-19           22.160832
Healthy            13.716765
Viral Pneumonia    19.052677
Name: Mean, dtype: float64
##################################
# Formulating the mean distribution
# by category of the image pixel values
##################################
sns.displot(data = imageEDA, x = 'Mean', kind="kde", hue = 'Class', height=6, aspect=1.40)
plt.title('Image Pixel Intensity Mean Distribution by Category', fontsize=14, weight='bold');
##################################
# Formulating the maximum distribution
# by category of the image pixel values
##################################
sns.displot(data = imageEDA, x = 'Max', kind="kde", hue = 'Class', height=6, aspect=1.40)
plt.title('Image Pixel Intensity Maximum Distribution by Category', fontsize=14, weight='bold');
##################################
# Formulating the minimum distribution
# by category of the image pixel values
##################################
sns.displot(data = imageEDA, x = 'Min', kind="kde", hue = 'Class', height=6, aspect=1.40)
plt.title('Image Pixel Intensity Minimum Distribution by Category', fontsize=14, weight='bold');
##################################
# Formulating the standard deviation distribution
# by category of the image pixel values
##################################
sns.displot(data = imageEDA, x = 'StDev', kind="kde", hue = 'Class', height=6, aspect=1.40)
plt.title('Image Pixel Intensity Standard Deviation Distribution by Category', fontsize=14, weight='bold');
##################################
# Formulating the mean and standard deviation
# scatterplot distribution
# by category of the image pixel values
##################################
plt.figure(figsize=(10,6))
sns.set(style="ticks", font_scale = 1)
ax = sns.scatterplot(data=imageEDA, x="Mean", y=imageEDA['StDev'], hue='Class', alpha=0.5)
sns.despine(top=True, right=True, left=False, bottom=False)
plt.xticks(rotation=0, fontsize = 12)
ax.set_xlabel('Image Pixel Intensity Mean',fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
plt.title('Image Pixel Intensity Mean and Standard Deviation Distribution', fontsize = 14, weight='bold');
##################################
# Formulating the mean and standard deviation
# scatterplot distribution
# by category of the image pixel values
##################################
scatterplot = sns.FacetGrid(imageEDA, col="Class", height=6)
scatterplot.map_dataframe(sns.scatterplot, x='Mean', y='StDev', alpha=0.5)
scatterplot.set_titles(col_template="{col_name}", row_template="{row_name}", size=18)
scatterplot.fig.subplots_adjust(top=.8)
scatterplot.fig.suptitle('Image Pixel Intensity Mean and Standard Deviation Distribution', fontsize=14, weight='bold')
axes = scatterplot.axes.flatten()
axes[0].set_ylabel('Image Pixel Intensity Standard Deviation')
for ax in axes:
    ax.set_xlabel('Image Pixel Intensity Mean')
scatterplot.fig.tight_layout()
##################################
# Formulating the mean and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
##################################
def getImage(path):
    return OffsetImage(cv2.imread(path),zoom = 0.1)
DF_sample = imageEDA.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Mean", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Mean', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(40,220)
ax.set_ylim(10,110)
plt.title('Overall: Image Pixel Intensity Mean and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Mean'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the mean and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
# for the Covid-19 class
##################################
path_covid = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/COVID/'
imageEDA_covid = imageEDA.loc[imageEDA['Class'] == 'Covid-19']
DF_sample = imageEDA_covid.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Mean", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Mean', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(40,220)
ax.set_ylim(10,110)
plt.title('Covid-19: Image Pixel Intensity Mean and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Mean'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the mean and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
# for the Viral Pneumonia class
##################################
path_viral_pneumonia = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/Viral Pneumonia/'
imageEDA_viral_pneumonia = imageEDA.loc[imageEDA['Class'] == 'Viral Pneumonia']
DF_sample = imageEDA_viral_pneumonia.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Mean", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Mean', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(40,220)
ax.set_ylim(10,110)
plt.title('Viral Pneumonia: Image Pixel Intensity Mean and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Mean'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the mean and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
# for the Normal class
##################################
path_normal = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/Normal/'
imageEDA_normal = imageEDA.loc[imageEDA['Class'] == 'Healthy']
DF_sample = imageEDA_normal.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Mean", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Mean', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(40,220)
ax.set_ylim(10,110)
plt.title('Normal: Image Pixel Intensity Mean and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Mean'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the minimum and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
##################################
DF_sample = imageEDA.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Min", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Minimum', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(-20,150)
ax.set_ylim(10,110)
plt.title('Overall: Image Pixel Intensity Minimum and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Min'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the minimum and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
# for the Covid-19 class
##################################
path_covid = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/COVID/'
imageEDA_covid = imageEDA.loc[imageEDA['Class'] == 'Covid-19']
DF_sample = imageEDA_covid.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Min", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Minimum', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(-20,150)
ax.set_ylim(10,110)
plt.title('Covid-19: Image Pixel Intensity Minimum and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Min'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the minimum and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
# for the Viral Pneumonia class
##################################
path_viral_pneumonia = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/Viral Pneumonia/'
imageEDA_viral_pneumonia = imageEDA.loc[imageEDA['Class'] == 'Viral Pneumonia']
DF_sample = imageEDA_viral_pneumonia.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Min", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Minimum', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(-20,150)
ax.set_ylim(10,110)
plt.title('Viral Pneumonia: Image Pixel Intensity Minimum and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Min'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the minimum and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
# for the Normal class
##################################
path_normal = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/Normal/'
imageEDA_normal = imageEDA.loc[imageEDA['Class'] == 'Healthy']
DF_sample = imageEDA_normal.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Min", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Minimum', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(-20,150)
ax.set_ylim(10,110)
plt.title('Normal: Image Pixel Intensity Minimum and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Min'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the maximum and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
##################################
def getImage(path):
    return OffsetImage(cv2.imread(path), zoom=0.1)
DF_sample = imageEDA.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Max", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Maximum', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(100,270)
ax.set_ylim(10,110)
plt.title('Overall: Image Pixel Intensity Maximum and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Max'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the maximum and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
# for the Covid-19 class
##################################
path_covid = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/COVID/'
imageEDA_covid = imageEDA.loc[imageEDA['Class'] == 'Covid-19']
DF_sample = imageEDA_covid.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Max", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Maximum', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(100,270)
ax.set_ylim(10,110)
plt.title('Covid-19: Image Pixel Intensity Maximum and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Max'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the maximum and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
# for the Viral Pneumonia class
##################################
path_viral_pneumonia = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/Viral Pneumonia/'
imageEDA_viral_pneumonia = imageEDA.loc[imageEDA['Class'] == 'Viral Pneumonia']
DF_sample = imageEDA_viral_pneumonia.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Max", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Maximum', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(100,270)
ax.set_ylim(10,110)
plt.title('Viral Pneumonia: Image Pixel Intensity Maximum and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Max'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Formulating the maximum and standard deviation
# scatterplot distribution
# of the image pixel values
# represented as actual images
# for the Normal class
##################################
path_normal = 'C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/Normal/'
imageEDA_normal = imageEDA.loc[imageEDA['Class'] == 'Healthy']
DF_sample = imageEDA_normal.sample(frac=1.0, replace=False, random_state=123)
paths = DF_sample['Path']
fig, ax = plt.subplots(figsize=(15,9))
ab = sns.scatterplot(data=DF_sample, x="Max", y='StDev')
sns.despine(top=True, right=True, left=False, bottom=False)
ax.set_xlabel('Image Pixel Intensity Maximum', fontsize=14, weight='bold')
ax.set_ylabel('Image Pixel Intensity Standard Deviation', fontsize=14, weight='bold')
ax.set_xlim(100,270)
ax.set_ylim(10,110)
plt.title('Normal: Image Pixel Intensity Maximum and Standard Deviation Distribution', fontsize=14, weight='bold');
for x0, y0, path in zip(DF_sample['Max'], DF_sample['StDev'], paths):
    ab = AnnotationBbox(getImage(path), (x0, y0), frameon=False)
    ax.add_artist(ab)
##################################
# Defining a function for
# plotting the loss profile
# of the training and validation sets
#################################
def plot_training_history(history, model_name):
    plt.figure(figsize=(10,6))
    plt.plot(history.history['loss'], label='Train')
    plt.plot(history.history['val_loss'], label='Validation')
    plt.title(f'{model_name} Training Loss', fontsize = 16, weight = 'bold', pad=20)
    plt.ylim(0, 5)
    plt.xlabel('Epoch', fontsize = 14, weight = 'bold')
    plt.ylabel('Loss', fontsize = 14, weight = 'bold')
    plt.legend(loc='upper right')
    plt.show()
##################################
# Formulating the network architecture
# for CNN with no regularization
##################################
set_seed()
batch_size = 16
model_nr = Sequential()
model_nr.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu', padding='Same', input_shape=(299, 299, 1)))
model_nr.add(MaxPooling2D(pool_size=(2, 2)))
model_nr.add(Conv2D(filters=64, kernel_size=(3, 3), padding='Same', activation='relu'))
model_nr.add(MaxPooling2D(pool_size=(2, 2)))
model_nr.add(Flatten())
model_nr.add(Dense(units=128, activation='relu'))
model_nr.add(Dense(units=num_classes, activation='softmax'))
##################################
# Compiling the network layers
##################################
model_nr.compile(loss='categorical_crossentropy', optimizer='adam', metrics=[Recall()])
##################################
# Displaying the model summary
# for CNN with no regularization
##################################
print(model_nr.summary())
Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape              Param #
=================================================================
 conv2d (Conv2D)                 (None, 299, 299, 32)      320
 max_pooling2d (MaxPooling2D)    (None, 149, 149, 32)      0
 conv2d_1 (Conv2D)               (None, 149, 149, 64)      18496
 max_pooling2d_1 (MaxPooling2D)  (None, 74, 74, 64)        0
 flatten (Flatten)               (None, 350464)            0
 dense (Dense)                   (None, 128)               44859520
 dense_1 (Dense)                 (None, 3)                 387
=================================================================
Total params: 44,878,723
Trainable params: 44,878,723
Non-trainable params: 0
_________________________________________________________________
None
##################################
# Displaying the model layers
# for CNN with no regularization
##################################
model_nr_layer_names = [layer.name for layer in model_nr.layers]
print("Layer Names:", model_nr_layer_names)
Layer Names: ['conv2d', 'max_pooling2d', 'conv2d_1', 'max_pooling2d_1', 'flatten', 'dense', 'dense_1']
##################################
# Displaying the number of weights
# for each model layer
# for CNN with no regularization
##################################
for layer in model_nr.layers:
    if hasattr(layer, 'weights'):
        print(f"Layer: {layer.name}, Number of Weights: {len(layer.get_weights())}")
Layer: conv2d, Number of Weights: 2
Layer: max_pooling2d, Number of Weights: 0
Layer: conv2d_1, Number of Weights: 2
Layer: max_pooling2d_1, Number of Weights: 0
Layer: flatten, Number of Weights: 0
Layer: dense, Number of Weights: 2
Layer: dense_1, Number of Weights: 2
##################################
# Displaying the number of parameters
# for each model layer
# for CNN with no regularization
##################################
total_parameters = 0
for layer in model_nr.layers:
    layer_parameters = layer.count_params()
    total_parameters += layer_parameters
    print(f"Layer: {layer.name}, Parameters: {layer_parameters}")
print("\nTotal Parameters in the Model:", total_parameters)
Layer: conv2d, Parameters: 320
Layer: max_pooling2d, Parameters: 0
Layer: conv2d_1, Parameters: 18496
Layer: max_pooling2d_1, Parameters: 0
Layer: flatten, Parameters: 0
Layer: dense, Parameters: 44859520
Layer: dense_1, Parameters: 387

Total Parameters in the Model: 44878723
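The per-layer counts printed above can be reproduced by hand from the architecture. The following is a minimal sketch in plain Python, assuming 3x3 kernels, a single input channel, 'same' convolution padding, 2x2 pooling with default stride, and three output classes as shown in the summary; the helper names are illustrative and are not part of the notebook.

```python
# Hand-check of the shapes and parameter counts reported for model_nr.
# Helper names are hypothetical; the numbers come from the summary output.

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    # Each filter has kernel_h * kernel_w * in_channels weights plus one bias
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    # One weight per input-output pair plus one bias per output unit
    return (in_units + 1) * out_units

def pooled_size(size, pool=2):
    # 2x2 max pooling with stride 2 and no padding floors the halved size
    return size // pool

side = 299                      # 'same' padding keeps 299 through each convolution
side = pooled_size(side)        # 149 after the first pooling layer
side = pooled_size(side)        # 74 after the second pooling layer
flat_units = side * side * 64   # units produced by Flatten

layer_params = {
    'conv2d': conv2d_params(3, 3, 1, 32),
    'conv2d_1': conv2d_params(3, 3, 32, 64),
    'dense': dense_params(flat_units, 128),
    'dense_1': dense_params(128, 3),
}
total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"Layer: {name}, Parameters: {count}")
print("Total:", total_params)
```

Nearly all of the parameters sit in the first dense layer, because it is fully connected to the 350,464 flattened units.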
##################################
# Fitting the model
# for CNN with no regularization
##################################
epochs = 100
set_seed()
model_nr_history = model_nr.fit(train_gen,
                                steps_per_epoch=len(train_gen) // batch_size,
                                validation_steps=len(test_gen) // batch_size,
                                validation_data=test_gen,
                                epochs=epochs,
                                verbose=0)
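The fit call above passes validation_steps=len(test_gen) // batch_size. The arithmetic below is a rough sketch of what that implies, assuming test_gen is a Keras image iterator whose length is the number of batches rather than the number of images. That assumption is consistent with the 240 images per class in the validation metrics and the 45 steps reported by predict, but the generator itself is defined outside this section.

```python
import math

# Values taken from this section: 3 classes x 240 validation images, batch_size = 16
n_validation_images = 3 * 240
batch_size = 16

# Assumed behaviour: len(test_gen) counts batches, not images
batches_per_pass = math.ceil(n_validation_images / batch_size)

# Value that the expression len(test_gen) // batch_size evaluates to
validation_steps_used = batches_per_pass // batch_size

# Images contributing to val_loss at the end of each epoch
images_per_validation = validation_steps_used * batch_size

print(batches_per_pass, validation_steps_used, images_per_validation)
```

Under these assumptions, each epoch's validation loss is estimated from a small fraction of the validation set, so the plotted validation curve may be noisier than the final metrics computed with predict on all batches. The same reasoning applies to steps_per_epoch on the training generator.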
##################################
# Evaluating the model
# for CNN with no regularization
# on the independent validation set
##################################
model_nr_y_pred = model_nr.predict(test_gen)
45/45 [==============================] - 4s 76ms/step
##################################
# Plotting the loss profile
# for CNN with no regularization
# on the training and validation sets
##################################
plot_training_history(model_nr_history, 'CNN With No Regularization : ')
##################################
# Consolidating the predictions
# for CNN with no regularization
# on the validation set
##################################
model_nr_predictions = np.argmax(model_nr_y_pred, axis=1)
model_nr_y_true = test_gen.classes
##################################
# Formulating the confusion matrix
# for CNN with no regularization
# on the validation set
##################################
CMatrix = pd.DataFrame(confusion_matrix(model_nr_y_true, model_nr_predictions), columns=classes, index =classes)
##################################
# Plotting the confusion matrix
# for CNN with no regularization
# on the validation set
##################################
plt.figure(figsize=(10, 6))
ax = sns.heatmap(CMatrix, annot = True, fmt = 'g' ,vmin = 0, vmax = 250,cmap = 'icefire')
ax.set_xlabel('Predicted',fontsize = 14,weight = 'bold')
ax.set_xticklabels(ax.get_xticklabels(),rotation =0)
ax.set_ylabel('Actual',fontsize = 14,weight = 'bold')
ax.set_yticklabels(ax.get_yticklabels(),rotation =0)
ax.set_title('CNN With No Regularization : Validation Set Confusion Matrix',fontsize = 14, weight = 'bold',pad=20);
##################################
# Resetting all states generated by Keras
##################################
keras.backend.clear_session()
##################################
# Calculating the model accuracy
# for CNN with no regularization
# for the entire validation set
##################################
model_nr_acc = accuracy_score(model_nr_y_true, model_nr_predictions)
##################################
# Calculating the model
# Precision, Recall, F-score and Support
# for CNN with no regularization
# for the entire validation set
##################################
model_nr_results_all = precision_recall_fscore_support(model_nr_y_true, model_nr_predictions, average='macro',zero_division = 1)
##################################
# Calculating the model
# Precision, Recall, F-score and Support
# for CNN with no regularization
# for each category of the validation set
##################################
model_nr_results_class = precision_recall_fscore_support(model_nr_y_true, model_nr_predictions, average=None, zero_division = 1)
##################################
# Consolidating all model evaluation metrics
# for CNN with no regularization
##################################
metric_columns = ['Precision','Recall','F-Score','Support']
model_nr_all_df = pd.concat([pd.DataFrame(list(model_nr_results_class)).T,pd.DataFrame(list(model_nr_results_all)).T])
model_nr_all_df.columns = metric_columns
model_nr_all_df.index = ['COVID', 'Normal', 'Viral Pneumonia','Total']
model_nr_all_df
| | Precision | Recall | F-Score | Support |
|---|---|---|---|---|
| COVID | 0.890295 | 0.879167 | 0.884696 | 240.0 |
| Normal | 0.671378 | 0.791667 | 0.726577 | 240.0 |
| Viral Pneumonia | 0.810000 | 0.675000 | 0.736364 | 240.0 |
| Total | 0.790558 | 0.781944 | 0.782546 | NaN |
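The Total row is produced with average='macro', which is the unweighted mean of the per-class values. The short check below reproduces that row from the three class rows; the numbers are copied from the table and the variable names are illustrative.

```python
# Per-class (Precision, Recall, F-Score) copied from the table above
per_class = {
    'COVID': (0.890295, 0.879167, 0.884696),
    'Normal': (0.671378, 0.791667, 0.726577),
    'Viral Pneumonia': (0.810000, 0.675000, 0.736364),
}

# Macro average: simple mean over classes, computed separately for each metric
macro = tuple(
    sum(values[i] for values in per_class.values()) / len(per_class)
    for i in range(3)
)

# For comparison only: harmonic mean of macro precision and macro recall
harmonic_of_macro = 2 * macro[0] * macro[1] / (macro[0] + macro[1])

print([round(m, 6) for m in macro], round(harmonic_of_macro, 6))
```

The macro F-Score is the mean of the class F-Scores, so it differs from the harmonic mean of the macro precision and recall. Support is NaN in the Total row because no support is reported for an averaged result.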
##################################
# Consolidating all model evaluation metrics
# for CNN with no regularization
##################################
model_nr_model_list = []
model_nr_measure_list = []
model_nr_category_list = []
model_nr_value_list = []
for i in range(3):
    for j in range(4):
        model_nr_model_list.append('CNN_NR')
        model_nr_measure_list.append(metric_columns[i])
        model_nr_category_list.append(model_nr_all_df.index[j])
        model_nr_value_list.append(model_nr_all_df.iloc[j,i])
model_nr_all_summary = pd.DataFrame(zip(model_nr_model_list,
                                        model_nr_measure_list,
                                        model_nr_category_list,
                                        model_nr_value_list),
                                    columns=['CNN.Model.Name',
                                             'Model.Metric',
                                             'Image.Category',
                                             'Metric.Value'])
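The nested loop above reshapes the metrics table into long format, with one row per metric and category pair. The sketch below shows the resulting row order in plain Python without pandas, using the values reported for the model with no regularization; the names table and rows are illustrative.

```python
from itertools import product

metric_columns = ['Precision', 'Recall', 'F-Score']
categories = ['COVID', 'Normal', 'Viral Pneumonia', 'Total']

# Metric values copied from the evaluation table (Support omitted, as in the loop)
table = {
    'COVID': (0.890295, 0.879167, 0.884696),
    'Normal': (0.671378, 0.791667, 0.726577),
    'Viral Pneumonia': (0.810000, 0.675000, 0.736364),
    'Total': (0.790558, 0.781944, 0.782546),
}

# Metrics vary slowest and categories fastest, matching the i/j loop order
rows = [
    ('CNN_NR', metric, category, table[category][i])
    for (i, metric), category in product(enumerate(metric_columns), categories)
]

for row in rows:
    print(row)
```

This yields 12 rows: four Precision rows, then four Recall rows, then four F-Score rows.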
##################################
# Formulating the network architecture
# for CNN with dropout regularization
##################################
set_seed()
batch_size = 16
model_dr = Sequential()
model_dr.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu', padding='Same', input_shape=(299, 299, 1)))
model_dr.add(MaxPooling2D(pool_size=(2, 2)))
model_dr.add(Conv2D(filters=64, kernel_size=(3, 3), padding = 'Same', activation='relu'))
model_dr.add(Dropout(rate=0.25))
model_dr.add(MaxPooling2D(pool_size=(2, 2)))
model_dr.add(Flatten())
model_dr.add(Dense(units=128, activation='relu'))
model_dr.add(Dense(units=num_classes, activation='softmax'))
##################################
# Compiling the network layers
##################################
model_dr.compile(loss='categorical_crossentropy', optimizer='adam', metrics=[Recall()])
##################################
# Displaying the model summary
# for CNN with dropout regularization
##################################
print(model_dr.summary())
Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape              Param #
=================================================================
 conv2d (Conv2D)                 (None, 299, 299, 32)      320
 max_pooling2d (MaxPooling2D)    (None, 149, 149, 32)      0
 conv2d_1 (Conv2D)               (None, 149, 149, 64)      18496
 dropout (Dropout)               (None, 149, 149, 64)      0
 max_pooling2d_1 (MaxPooling2D)  (None, 74, 74, 64)        0
 flatten (Flatten)               (None, 350464)            0
 dense (Dense)                   (None, 128)               44859520
 dense_1 (Dense)                 (None, 3)                 387
=================================================================
Total params: 44,878,723
Trainable params: 44,878,723
Non-trainable params: 0
_________________________________________________________________
None
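The dropout layer appears in the summary with zero parameters because it learns nothing: during training it zeroes each activation with probability equal to the rate and scales the remaining ones by 1 / (1 - rate), and at inference it passes values through unchanged. The following is a minimal plain-Python sketch of that training-time behaviour for rate=0.25; the function inverted_dropout and the seed are illustrative, not the Keras implementation.

```python
import random

def inverted_dropout(values, rate, rng):
    # Zero each value with probability `rate`; scale survivors so the
    # expected value of every activation is unchanged
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() >= rate else 0.0 for v in values]

rng = random.Random(123)        # fixed seed so the sketch is repeatable
activations = [1.0] * 10000     # toy activations, all equal to 1
dropped = inverted_dropout(activations, rate=0.25, rng=rng)

zero_fraction = dropped.count(0.0) / len(dropped)
mean_after = sum(dropped) / len(dropped)

print(f"fraction zeroed: {zero_fraction:.3f}, mean after dropout: {mean_after:.3f}")
```

About a quarter of the activations are zeroed while the mean stays close to its original value, which is why no rescaling is needed when dropout is switched off at prediction time.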
##################################
# Displaying the model layers
# for CNN with dropout regularization
##################################
model_dr_layer_names = [layer.name for layer in model_dr.layers]
print("Layer Names:", model_dr_layer_names)
Layer Names: ['conv2d', 'max_pooling2d', 'conv2d_1', 'dropout', 'max_pooling2d_1', 'flatten', 'dense', 'dense_1']
##################################
# Displaying the number of weights
# for each model layer
# for CNN with dropout regularization
##################################
for layer in model_dr.layers:
    if hasattr(layer, 'weights'):
        print(f"Layer: {layer.name}, Number of Weights: {len(layer.get_weights())}")
Layer: conv2d, Number of Weights: 2
Layer: max_pooling2d, Number of Weights: 0
Layer: conv2d_1, Number of Weights: 2
Layer: dropout, Number of Weights: 0
Layer: max_pooling2d_1, Number of Weights: 0
Layer: flatten, Number of Weights: 0
Layer: dense, Number of Weights: 2
Layer: dense_1, Number of Weights: 2
##################################
# Displaying the number of parameters
# for each model layer
# for CNN with dropout regularization
##################################
total_parameters = 0
for layer in model_dr.layers:
    layer_parameters = layer.count_params()
    total_parameters += layer_parameters
    print(f"Layer: {layer.name}, Parameters: {layer_parameters}")
print("\nTotal Parameters in the Model:", total_parameters)
Layer: conv2d, Parameters: 320
Layer: max_pooling2d, Parameters: 0
Layer: conv2d_1, Parameters: 18496
Layer: dropout, Parameters: 0
Layer: max_pooling2d_1, Parameters: 0
Layer: flatten, Parameters: 0
Layer: dense, Parameters: 44859520
Layer: dense_1, Parameters: 387

Total Parameters in the Model: 44878723
##################################
# Fitting the model
# for CNN with dropout regularization
##################################
epochs = 100
set_seed()
model_dr_history = model_dr.fit(train_gen,
                                steps_per_epoch=len(train_gen) // batch_size,
                                validation_steps=len(test_gen) // batch_size,
                                validation_data=test_gen,
                                epochs=epochs,
                                verbose=0)
##################################
# Evaluating the model
# for CNN with dropout regularization
# on the independent validation set
##################################
model_dr_y_pred = model_dr.predict(test_gen)
45/45 [==============================] - 4s 77ms/step
##################################
# Plotting the loss profile
# for CNN with dropout regularization
# on the training and validation sets
##################################
plot_training_history(model_dr_history, 'CNN With Dropout Regularization : ')
##################################
# Consolidating the predictions
# for CNN with dropout regularization
# on the validation set
##################################
model_dr_predictions = np.argmax(model_dr_y_pred, axis=1)
model_dr_y_true=test_gen.classes
##################################
# Formulating the confusion matrix
# for CNN with dropout regularization
# on the validation set
##################################
CMatrix = pd.DataFrame(confusion_matrix(model_dr_y_true, model_dr_predictions), columns=classes, index =classes)
##################################
# Plotting the confusion matrix
# for CNN with dropout regularization
# on the validation set
##################################
plt.figure(figsize=(10, 6))
ax = sns.heatmap(CMatrix, annot = True, fmt = 'g' ,vmin = 0, vmax = 250, cmap = 'icefire')
ax.set_xlabel('Predicted',fontsize = 14,weight = 'bold')
ax.set_xticklabels(ax.get_xticklabels(),rotation =0)
ax.set_ylabel('Actual',fontsize = 14,weight = 'bold')
ax.set_yticklabels(ax.get_yticklabels(),rotation =0)
ax.set_title('CNN With Dropout Regularization : Validation Set Confusion Matrix',fontsize = 14, weight = 'bold', pad=20);
##################################
# Resetting all states generated by Keras
##################################
keras.backend.clear_session()
##################################
# Calculating the model accuracy
# for CNN with dropout regularization
# for the entire validation set
##################################
model_dr_acc = accuracy_score(model_dr_y_true, model_dr_predictions)
##################################
# Calculating the model
# Precision, Recall, F-score and Support
# for CNN with dropout regularization
# for the entire validation set
##################################
model_dr_results_all = precision_recall_fscore_support(model_dr_y_true, model_dr_predictions, average='macro',zero_division = 1)
##################################
# Calculating the model
# Precision, Recall, F-score and Support
# for CNN with dropout regularization
# for each category of the validation set
##################################
model_dr_results_class = precision_recall_fscore_support(model_dr_y_true, model_dr_predictions, average=None, zero_division = 1)
##################################
# Consolidating all model evaluation metrics
# for CNN with dropout regularization
##################################
metric_columns = ['Precision','Recall', 'F-Score','Support']
model_dr_all_df = pd.concat([pd.DataFrame(list(model_dr_results_class)).T,pd.DataFrame(list(model_dr_results_all)).T])
model_dr_all_df.columns = metric_columns
model_dr_all_df.index = ['COVID', 'Normal', 'Viral Pneumonia','Total']
model_dr_all_df
| | Precision | Recall | F-Score | Support |
|---|---|---|---|---|
| COVID | 0.906504 | 0.929167 | 0.917695 | 240.0 |
| Normal | 0.904255 | 0.708333 | 0.794393 | 240.0 |
| Viral Pneumonia | 0.758741 | 0.904167 | 0.825095 | 240.0 |
| Total | 0.856500 | 0.847222 | 0.845728 | NaN |
##################################
# Consolidating all model evaluation metrics
# for CNN with dropout regularization
##################################
model_dr_model_list = []
model_dr_measure_list = []
model_dr_category_list = []
model_dr_value_list = []
for i in range(3):
    for j in range(4):
        model_dr_model_list.append('CNN_DR')
        model_dr_measure_list.append(metric_columns[i])
        model_dr_category_list.append(model_dr_all_df.index[j])
        model_dr_value_list.append(model_dr_all_df.iloc[j,i])
model_dr_all_summary = pd.DataFrame(zip(model_dr_model_list,
                                        model_dr_measure_list,
                                        model_dr_category_list,
                                        model_dr_value_list),
                                    columns=['CNN.Model.Name',
                                             'Model.Metric',
                                             'Image.Category',
                                             'Metric.Value'])
##################################
# Formulating the network architecture
# for CNN with batch normalization regularization
##################################
set_seed()
batch_size = 16
model_bnr = Sequential()
model_bnr.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu', padding='Same', input_shape=(299, 299, 1)))
model_bnr.add(MaxPooling2D(pool_size=(2, 2)))
model_bnr.add(Conv2D(filters=64, kernel_size=(3, 3), padding='Same', activation='relu'))
model_bnr.add(BatchNormalization())
model_bnr.add(Activation('relu'))
model_bnr.add(MaxPooling2D(pool_size=(2, 2)))
model_bnr.add(Flatten())
model_bnr.add(Dense(units=128, activation='relu'))
model_bnr.add(Dense(units=num_classes, activation='softmax'))
##################################
# Compiling the network layers
##################################
model_bnr.compile(loss='categorical_crossentropy', optimizer='adam', metrics=[Recall()])
##################################
# Displaying the model summary
# for CNN with batch normalization regularization
##################################
print(model_bnr.summary())
Model: "sequential"
___________________________________________________________________________
 Layer (type)                              Output Shape              Param #
===========================================================================
 conv2d (Conv2D)                           (None, 299, 299, 32)      320
 max_pooling2d (MaxPooling2D)              (None, 149, 149, 32)      0
 conv2d_1 (Conv2D)                         (None, 149, 149, 64)      18496
 batch_normalization (BatchNormalization)  (None, 149, 149, 64)      256
 activation (Activation)                   (None, 149, 149, 64)      0
 max_pooling2d_1 (MaxPooling2D)            (None, 74, 74, 64)        0
 flatten (Flatten)                         (None, 350464)            0
 dense (Dense)                             (None, 128)               44859520
 dense_1 (Dense)                           (None, 3)                 387
===========================================================================
Total params: 44,878,979
Trainable params: 44,878,851
Non-trainable params: 128
___________________________________________________________________________
None
##################################
# Displaying the model layers
# for CNN with batch normalization regularization
##################################
model_bnr_layer_names = [layer.name for layer in model_bnr.layers]
print("Layer Names:", model_bnr_layer_names)
Layer Names: ['conv2d', 'max_pooling2d', 'conv2d_1', 'batch_normalization', 'activation', 'max_pooling2d_1', 'flatten', 'dense', 'dense_1']
##################################
# Displaying the number of weights
# for each model layer
# for CNN with batch normalization regularization
##################################
for layer in model_bnr.layers:
    if hasattr(layer, 'weights'):
        print(f"Layer: {layer.name}, Number of Weights: {len(layer.get_weights())}")
Layer: conv2d, Number of Weights: 2
Layer: max_pooling2d, Number of Weights: 0
Layer: conv2d_1, Number of Weights: 2
Layer: batch_normalization, Number of Weights: 4
Layer: activation, Number of Weights: 0
Layer: max_pooling2d_1, Number of Weights: 0
Layer: flatten, Number of Weights: 0
Layer: dense, Number of Weights: 2
Layer: dense_1, Number of Weights: 2
##################################
# Displaying the number of parameters
# for each model layer
# for CNN with batch normalization regularization
##################################
total_parameters = 0
for layer in model_bnr.layers:
    layer_parameters = layer.count_params()
    total_parameters += layer_parameters
    print(f"Layer: {layer.name}, Parameters: {layer_parameters}")
print("\nTotal Parameters in the Model:", total_parameters)
Layer: conv2d, Parameters: 320
Layer: max_pooling2d, Parameters: 0
Layer: conv2d_1, Parameters: 18496
Layer: batch_normalization, Parameters: 256
Layer: activation, Parameters: 0
Layer: max_pooling2d_1, Parameters: 0
Layer: flatten, Parameters: 0
Layer: dense, Parameters: 44859520
Layer: dense_1, Parameters: 387

Total Parameters in the Model: 44878979
##################################
# Fitting the model
# for CNN with batch normalization regularization
##################################
epochs = 100
set_seed()
model_bnr_history = model_bnr.fit(train_gen,
                                  steps_per_epoch=len(train_gen) // batch_size,
                                  validation_steps=len(test_gen) // batch_size,
                                  validation_data=test_gen,
                                  epochs=epochs,
                                  verbose=0)
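One detail of the call above is worth checking: for a Keras data generator, `len(generator)` already returns the number of batches per epoch rather than the number of images. The sketch below works through the arithmetic, assuming the validation generator holds 720 images (three categories with a support of 240 each) and uses a batch size of 16, which is consistent with the 45 steps reported by `predict` in this document. Under those assumptions, dividing by `batch_size` a second time limits the validation pass in each epoch to 2 batches.

```python
import math

# Assumed values: 720 validation images (240 per category) and a batch size of 16
n_validation_images = 720
batch_size = 16

# len(generator) for a Keras iterator is the number of batches per epoch
batches_per_epoch = math.ceil(n_validation_images / batch_size)

# The fit call divides that batch count by batch_size once more
validation_steps_used = batches_per_epoch // batch_size
images_seen = validation_steps_used * batch_size

print(batches_per_epoch, validation_steps_used, images_seen)
```

If the intent is to cover every batch, passing `len(train_gen)` and `len(test_gen)` directly, or omitting both step arguments, would be the usual choice. The results reported in this document were produced with the call as written.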
##################################
# Evaluating the model
# for CNN with batch normalization regularization
# on the independent validation set
##################################
model_bnr_y_pred = model_bnr.predict(test_gen)
45/45 [==============================] - 4s 85ms/step
##################################
# Plotting the loss profile
# for CNN with batch normalization regularization
# on the training and validation sets
##################################
plot_training_history(model_bnr_history, 'CNN With Batch Normalization Regularization : ')
##################################
# Consolidating the predictions
# for CNN with batch normalization regularization
# on the validation set
##################################
model_bnr_predictions = np.array(list(map(lambda x: np.argmax(x), model_bnr_y_pred)))
model_bnr_y_true = test_gen.classes
##################################
# Formulating the confusion matrix
# for CNN with batch normalization regularization
# on the validation set
##################################
CMatrix = pd.DataFrame(confusion_matrix(model_bnr_y_true, model_bnr_predictions), columns=classes, index =classes)
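As an illustration of what `confusion_matrix` tabulates, the sketch below builds the same kind of matrix by hand for a small set of made-up labels (the label lists are hypothetical and unrelated to the study data). Rows index the actual category and columns the predicted category, matching the orientation used for `CMatrix`.

```python
# Hypothetical toy labels: 0 = COVID, 1 = Normal, 2 = Viral Pneumonia
y_true = [0, 0, 0, 1, 1, 1, 2, 2, 2]
y_pred = [0, 0, 2, 1, 1, 1, 2, 1, 2]
n_classes = 3

# matrix[actual][predicted] counts how often each pairing occurs
matrix = [[0] * n_classes for _ in range(n_classes)]
for actual, predicted in zip(y_true, y_pred):
    matrix[actual][predicted] += 1

# Recall per class: correct predictions divided by the row total
recall = [matrix[k][k] / sum(matrix[k]) for k in range(n_classes)]
# Precision per class: correct predictions divided by the column total
precision = [matrix[k][k] / sum(row[k] for row in matrix) for k in range(n_classes)]

for row in matrix:
    print(row)
print("Recall:", recall)
print("Precision:", precision)
```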
##################################
# Plotting the confusion matrix
# for CNN with batch normalization regularization
# on the validation set
##################################
plt.figure(figsize=(10, 6))
ax = sns.heatmap(CMatrix, annot = True, fmt = 'g' ,vmin = 0, vmax = 250,cmap = 'icefire')
ax.set_xlabel('Predicted',fontsize = 14,weight = 'bold')
ax.set_xticklabels(ax.get_xticklabels(),rotation =0)
ax.set_ylabel('Actual',fontsize = 14,weight = 'bold')
ax.set_yticklabels(ax.get_yticklabels(),rotation =0)
ax.set_title('CNN With Batch Normalization Regularization : Validation Set Confusion Matrix',fontsize = 16,weight = 'bold',pad=20);
##################################
# Resetting all states generated by Keras
##################################
keras.backend.clear_session()
##################################
# Calculating the model accuracy
# for CNN with batch normalization regularization
# for the entire validation set
##################################
model_bnr_acc = accuracy_score(model_bnr_y_true, model_bnr_predictions)
##################################
# Calculating the model
# Precision, Recall, F-score and Support
# for CNN with batch normalization regularization
# for the entire validation set
##################################
model_bnr_results_all = precision_recall_fscore_support(model_bnr_y_true, model_bnr_predictions, average='macro',zero_division = 1)
##################################
# Calculating the model
# Precision, Recall, F-score and Support
# for CNN with batch normalization regularization
# for each category of the validation set
##################################
model_bnr_results_class = precision_recall_fscore_support(model_bnr_y_true, model_bnr_predictions, average=None, zero_division = 1)
##################################
# Consolidating all model evaluation metrics
# for CNN with batch normalization regularization
##################################
metric_columns = ['Precision','Recall', 'F-Score','Support']
model_bnr_all_df = pd.concat([pd.DataFrame(list(model_bnr_results_class)).T,pd.DataFrame(list(model_bnr_results_all)).T])
model_bnr_all_df.columns = metric_columns
model_bnr_all_df.index = ['COVID', 'Normal', 'Viral Pneumonia','Total']
model_bnr_all_df
| | Precision | Recall | F-Score | Support |
|---|---|---|---|---|
| COVID | 0.943231 | 0.900000 | 0.921109 | 240.0 |
| Normal | 0.876448 | 0.945833 | 0.909820 | 240.0 |
| Viral Pneumonia | 0.913793 | 0.883333 | 0.898305 | 240.0 |
| Total | 0.911157 | 0.909722 | 0.909744 | NaN |
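The `Total` row is the macro average requested with `average='macro'`, that is, the unweighted mean of the three per-category values, and each F-score is the harmonic mean of the corresponding precision and recall. The sketch below checks both relationships against the rounded values in the table, so agreement is only expected to about five decimal places.

```python
# Per-category precision and recall copied from the table above (rounded)
precision = {"COVID": 0.943231, "Normal": 0.876448, "Viral Pneumonia": 0.913793}
recall = {"COVID": 0.900000, "Normal": 0.945833, "Viral Pneumonia": 0.883333}

def f_score(p, r):
    # Harmonic mean of precision and recall (F1)
    return 2 * p * r / (p + r)

def macro(values):
    # Unweighted mean across categories
    return sum(values) / len(values)

f_scores = {name: f_score(precision[name], recall[name]) for name in precision}
macro_precision = macro(list(precision.values()))
macro_recall = macro(list(recall.values()))
macro_f_score = macro(list(f_scores.values()))

print(round(macro_precision, 6), round(macro_recall, 6), round(macro_f_score, 6))
```

Because every category has the same support (240 images), the macro average here coincides with a support-weighted average.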
##################################
# Consolidating all model evaluation metrics
# for CNN with batch normalization regularization
##################################
model_bnr_model_list = []
model_bnr_measure_list = []
model_bnr_category_list = []
model_bnr_value_list = []
for i in range(3):
    for j in range(4):
        model_bnr_model_list.append('CNN_BNR')
        model_bnr_measure_list.append(metric_columns[i])
        model_bnr_category_list.append(model_bnr_all_df.index[j])
        model_bnr_value_list.append(model_bnr_all_df.iloc[j,i])
model_bnr_all_summary = pd.DataFrame(zip(model_bnr_model_list,
                                         model_bnr_measure_list,
                                         model_bnr_category_list,
                                         model_bnr_value_list),
                                     columns=['CNN.Model.Name',
                                              'Model.Metric',
                                              'Image.Category',
                                              'Metric.Value'])
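The nested loop above reshapes the wide metric table into long format, one row per (metric, category) pair, so that results from several models can later be stacked. The same pairing can be expressed compactly with `itertools.product`, shown here as a dependency-free sketch in which the table values are hard-coded in a hypothetical `table` dictionary rather than read from `model_bnr_all_df`.

```python
from itertools import product

# Hypothetical stand-in for model_bnr_all_df, restricted to the three plotted metrics
metrics = ["Precision", "Recall", "F-Score"]
table = {
    "COVID": [0.943231, 0.900000, 0.921109],
    "Normal": [0.876448, 0.945833, 0.909820],
    "Viral Pneumonia": [0.913793, 0.883333, 0.898305],
    "Total": [0.911157, 0.909722, 0.909744],
}

# Outer iteration over metrics, inner iteration over categories,
# mirroring the order of the nested loop
rows = [
    ("CNN_BNR", metric, category, table[category][i])
    for (i, metric), category in product(enumerate(metrics), table)
]

for row in rows[:4]:
    print(row)
```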
##################################
# Formulating the network architecture
# for CNN with dropout and batch normalization regularization
##################################
set_seed()
batch_size = 16
model_dr_bnr = Sequential()
model_dr_bnr.add(Conv2D(filters=32, kernel_size=(3, 3), activation='relu', padding='Same', input_shape=(299, 299, 1)))
model_dr_bnr.add(MaxPooling2D(pool_size=(2, 2)))
model_dr_bnr.add(Conv2D(filters=64, kernel_size=(3, 3), padding='Same', activation='relu'))
model_dr_bnr.add(BatchNormalization())
model_dr_bnr.add(Activation('relu'))
model_dr_bnr.add(Dropout(0.25))
model_dr_bnr.add(MaxPooling2D(pool_size=(2, 2)))
model_dr_bnr.add(Flatten())
model_dr_bnr.add(Dense(units=128, activation='relu'))
model_dr_bnr.add(Dense(units=num_classes, activation='softmax'))
##################################
# Compiling the network layers
##################################
model_dr_bnr.compile(loss='categorical_crossentropy', optimizer='adam', metrics=[Recall()])
##################################
# Displaying the model summary
# for CNN with dropout and
# batch normalization regularization
##################################
print(model_dr_bnr.summary())
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 299, 299, 32) 320
max_pooling2d (MaxPooling2D (None, 149, 149, 32) 0
)
conv2d_1 (Conv2D) (None, 149, 149, 64) 18496
batch_normalization (BatchN (None, 149, 149, 64) 256
ormalization)
activation (Activation) (None, 149, 149, 64) 0
dropout (Dropout) (None, 149, 149, 64) 0
max_pooling2d_1 (MaxPooling (None, 74, 74, 64) 0
2D)
flatten (Flatten) (None, 350464) 0
dense (Dense) (None, 128) 44859520
dense_1 (Dense) (None, 3) 387
=================================================================
Total params: 44,878,979
Trainable params: 44,878,851
Non-trainable params: 128
_________________________________________________________________
None
##################################
# Displaying the model layers
# for CNN with dropout and
# batch normalization regularization
##################################
model_dr_bnr_layer_names = [layer.name for layer in model_dr_bnr.layers]
print("Layer Names:", model_dr_bnr_layer_names)
Layer Names: ['conv2d', 'max_pooling2d', 'conv2d_1', 'batch_normalization', 'activation', 'dropout', 'max_pooling2d_1', 'flatten', 'dense', 'dense_1']
##################################
# Displaying the number of weights
# for CNN with dropout and
# batch normalization regularization
##################################
for layer in model_dr_bnr.layers:
    if hasattr(layer, 'weights'):
        print(f"Layer: {layer.name}, Number of Weights: {len(layer.get_weights())}")
Layer: conv2d, Number of Weights: 2
Layer: max_pooling2d, Number of Weights: 0
Layer: conv2d_1, Number of Weights: 2
Layer: batch_normalization, Number of Weights: 4
Layer: activation, Number of Weights: 0
Layer: dropout, Number of Weights: 0
Layer: max_pooling2d_1, Number of Weights: 0
Layer: flatten, Number of Weights: 0
Layer: dense, Number of Weights: 2
Layer: dense_1, Number of Weights: 2
##################################
# Displaying the number of parameters
# for each model layer
# for CNN with dropout and
# batch normalization regularization
##################################
total_parameters = 0
for layer in model_dr_bnr.layers:
    layer_parameters = layer.count_params()
    total_parameters += layer_parameters
    print(f"Layer: {layer.name}, Parameters: {layer_parameters}")
print("\nTotal Parameters in the Model:", total_parameters)
Layer: conv2d, Parameters: 320
Layer: max_pooling2d, Parameters: 0
Layer: conv2d_1, Parameters: 18496
Layer: batch_normalization, Parameters: 256
Layer: activation, Parameters: 0
Layer: dropout, Parameters: 0
Layer: max_pooling2d_1, Parameters: 0
Layer: flatten, Parameters: 0
Layer: dense, Parameters: 44859520
Layer: dense_1, Parameters: 387

Total Parameters in the Model: 44878979
##################################
# Fitting the model
# for CNN with dropout and
# batch normalization regularization
##################################
epochs = 100
set_seed()
model_dr_bnr_history = model_dr_bnr.fit(train_gen,
                                        steps_per_epoch=len(train_gen) // batch_size,
                                        validation_steps=len(test_gen) // batch_size,
                                        validation_data=test_gen,
                                        epochs=epochs,
                                        verbose=0)
##################################
# Evaluating the model
# for CNN with dropout and
# batch normalization regularization
# on the independent validation set
##################################
model_dr_bnr_y_pred = model_dr_bnr.predict(test_gen)
45/45 [==============================] - 4s 97ms/step
##################################
# Plotting the loss profile
# for CNN with dropout and
# batch normalization regularization
# on the training and validation sets
##################################
plot_training_history(model_dr_bnr_history, 'CNN With Dropout and Batch Normalization Regularization : ')
##################################
# Consolidating the predictions
# for CNN with dropout and
# batch normalization regularization
# on the validation set
##################################
model_dr_bnr_predictions = np.array(list(map(lambda x: np.argmax(x), model_dr_bnr_y_pred)))
model_dr_bnr_y_true = test_gen.classes
##################################
# Formulating the confusion matrix
# for CNN with dropout and
# batch normalization regularization
# on the validation set
##################################
CMatrix = pd.DataFrame(confusion_matrix(model_dr_bnr_y_true, model_dr_bnr_predictions), columns=classes, index =classes)
##################################
# Plotting the confusion matrix
# for CNN with dropout and
# batch normalization regularization
# on the validation set
##################################
plt.figure(figsize=(10, 6))
ax = sns.heatmap(CMatrix, annot = True, fmt = 'g' ,vmin = 0, vmax = 250,cmap = 'icefire')
ax.set_xlabel('Predicted',fontsize = 14,weight = 'bold')
ax.set_xticklabels(ax.get_xticklabels(),rotation =0)
ax.set_ylabel('Actual',fontsize = 14,weight = 'bold')
ax.set_yticklabels(ax.get_yticklabels(),rotation =0)
ax.set_title('CNN With Dropout and Batch Normalization Regularization : Validation Set Confusion Matrix',fontsize = 16,weight = 'bold',pad=20);
##################################
# Resetting all states generated by Keras
##################################
keras.backend.clear_session()
##################################
# Calculating the model accuracy
# for CNN with dropout and
# batch normalization regularization
# for the entire validation set
##################################
model_dr_bnr_acc = accuracy_score(model_dr_bnr_y_true, model_dr_bnr_predictions)
##################################
# Calculating the model
# Precision, Recall, F-score and Support
# for CNN with dropout and
# batch normalization regularization
# for the entire validation set
##################################
model_dr_bnr_results_all = precision_recall_fscore_support(model_dr_bnr_y_true, model_dr_bnr_predictions, average='macro',zero_division = 1)
##################################
# Calculating the model
# Precision, Recall, F-score and Support
# for CNN with dropout and
# batch normalization regularization
# for each category of the validation set
##################################
model_dr_bnr_results_class = precision_recall_fscore_support(model_dr_bnr_y_true, model_dr_bnr_predictions, average=None, zero_division = 1)
##################################
# Consolidating all model evaluation metrics
# for CNN with dropout and
# batch normalization regularization
##################################
metric_columns = ['Precision','Recall', 'F-Score','Support']
model_dr_bnr_all_df = pd.concat([pd.DataFrame(list(model_dr_bnr_results_class)).T,pd.DataFrame(list(model_dr_bnr_results_all)).T])
model_dr_bnr_all_df.columns = metric_columns
model_dr_bnr_all_df.index = ['COVID', 'Normal', 'Viral Pneumonia','Total']
model_dr_bnr_all_df
| | Precision | Recall | F-Score | Support |
|---|---|---|---|---|
| COVID | 0.950450 | 0.879167 | 0.913420 | 240.0 |
| Normal | 0.779310 | 0.941667 | 0.852830 | 240.0 |
| Viral Pneumonia | 0.923077 | 0.800000 | 0.857143 | 240.0 |
| Total | 0.884279 | 0.873611 | 0.874464 | NaN |
##################################
# Consolidating all model evaluation metrics
# for CNN with dropout and
# batch normalization regularization
##################################
model_dr_bnr_model_list = []
model_dr_bnr_measure_list = []
model_dr_bnr_category_list = []
model_dr_bnr_value_list = []
for i in range(3):
    for j in range(4):
        model_dr_bnr_model_list.append('CNN_DR_BNR')
        model_dr_bnr_measure_list.append(metric_columns[i])
        model_dr_bnr_category_list.append(model_dr_bnr_all_df.index[j])
        model_dr_bnr_value_list.append(model_dr_bnr_all_df.iloc[j,i])
model_dr_bnr_all_summary = pd.DataFrame(zip(model_dr_bnr_model_list,
                                            model_dr_bnr_measure_list,
                                            model_dr_bnr_category_list,
                                            model_dr_bnr_value_list),
                                        columns=['CNN.Model.Name',
                                                 'Model.Metric',
                                                 'Image.Category',
                                                 'Metric.Value'])
##################################
# Consolidating all the
# CNN model performance measures
##################################
cnn_model_performance_comparison = pd.concat([model_nr_all_summary,
                                              model_dr_all_summary,
                                              model_bnr_all_summary,
                                              model_dr_bnr_all_summary],
                                             ignore_index=True)
##################################
# Consolidating all the precision
# model performance measures
##################################
cnn_model_performance_comparison_precision = cnn_model_performance_comparison[cnn_model_performance_comparison['Model.Metric']=='Precision']
cnn_model_performance_comparison_precision_CNN_NR = cnn_model_performance_comparison_precision[cnn_model_performance_comparison_precision['CNN.Model.Name']=='CNN_NR'].loc[:,"Metric.Value"]
cnn_model_performance_comparison_precision_CNN_DR = cnn_model_performance_comparison_precision[cnn_model_performance_comparison_precision['CNN.Model.Name']=='CNN_DR'].loc[:,"Metric.Value"]
cnn_model_performance_comparison_precision_CNN_BNR = cnn_model_performance_comparison_precision[cnn_model_performance_comparison_precision['CNN.Model.Name']=='CNN_BNR'].loc[:,"Metric.Value"]
cnn_model_performance_comparison_precision_CNN_DR_BNR = cnn_model_performance_comparison_precision[cnn_model_performance_comparison_precision['CNN.Model.Name']=='CNN_DR_BNR'].loc[:,"Metric.Value"]
##################################
# Combining all the precision
# model performance measures
# for all CNN models
##################################
cnn_model_performance_comparison_precision_plot = pd.DataFrame({'CNN_NR': cnn_model_performance_comparison_precision_CNN_NR.values,
                                                                'CNN_DR': cnn_model_performance_comparison_precision_CNN_DR.values,
                                                                'CNN_BNR': cnn_model_performance_comparison_precision_CNN_BNR.values,
                                                                'CNN_DR_BNR': cnn_model_performance_comparison_precision_CNN_DR_BNR.values},
                                                               index=cnn_model_performance_comparison_precision['Image.Category'].unique())
cnn_model_performance_comparison_precision_plot
| | CNN_NR | CNN_DR | CNN_BNR | CNN_DR_BNR |
|---|---|---|---|---|
| COVID | 0.890295 | 0.906504 | 0.943231 | 0.950450 |
| Normal | 0.671378 | 0.904255 | 0.876448 | 0.779310 |
| Viral Pneumonia | 0.810000 | 0.758741 | 0.913793 | 0.923077 |
| Total | 0.790558 | 0.856500 | 0.911157 | 0.884279 |
##################################
# Plotting all the precision
# model performance measures
# for all CNN models
##################################
cnn_model_performance_comparison_precision_plot = cnn_model_performance_comparison_precision_plot.plot.barh(figsize=(10, 6), width=0.90)
cnn_model_performance_comparison_precision_plot.set_xlim(0.00,1.00)
cnn_model_performance_comparison_precision_plot.set_title("Model Comparison by Precision Performance on Validation Data")
cnn_model_performance_comparison_precision_plot.set_xlabel("Precision Performance")
cnn_model_performance_comparison_precision_plot.set_ylabel("Image Categories")
cnn_model_performance_comparison_precision_plot.grid(False)
cnn_model_performance_comparison_precision_plot.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
for container in cnn_model_performance_comparison_precision_plot.containers:
    cnn_model_performance_comparison_precision_plot.bar_label(container, fmt='%.5f', padding=-50, color='white', fontweight='bold')
##################################
# Consolidating all the recall
# model performance measures
##################################
cnn_model_performance_comparison_recall = cnn_model_performance_comparison[cnn_model_performance_comparison['Model.Metric']=='Recall']
cnn_model_performance_comparison_recall_CNN_NR = cnn_model_performance_comparison_recall[cnn_model_performance_comparison_recall['CNN.Model.Name']=='CNN_NR'].loc[:,"Metric.Value"]
cnn_model_performance_comparison_recall_CNN_DR = cnn_model_performance_comparison_recall[cnn_model_performance_comparison_recall['CNN.Model.Name']=='CNN_DR'].loc[:,"Metric.Value"]
cnn_model_performance_comparison_recall_CNN_BNR = cnn_model_performance_comparison_recall[cnn_model_performance_comparison_recall['CNN.Model.Name']=='CNN_BNR'].loc[:,"Metric.Value"]
cnn_model_performance_comparison_recall_CNN_DR_BNR = cnn_model_performance_comparison_recall[cnn_model_performance_comparison_recall['CNN.Model.Name']=='CNN_DR_BNR'].loc[:,"Metric.Value"]
##################################
# Combining all the recall
# model performance measures
# for all CNN models
##################################
cnn_model_performance_comparison_recall_plot = pd.DataFrame({'CNN_NR': cnn_model_performance_comparison_recall_CNN_NR.values,
                                                             'CNN_DR': cnn_model_performance_comparison_recall_CNN_DR.values,
                                                             'CNN_BNR': cnn_model_performance_comparison_recall_CNN_BNR.values,
                                                             'CNN_DR_BNR': cnn_model_performance_comparison_recall_CNN_DR_BNR.values},
                                                            index=cnn_model_performance_comparison_recall['Image.Category'].unique())
cnn_model_performance_comparison_recall_plot
| | CNN_NR | CNN_DR | CNN_BNR | CNN_DR_BNR |
|---|---|---|---|---|
| COVID | 0.879167 | 0.929167 | 0.900000 | 0.879167 |
| Normal | 0.791667 | 0.708333 | 0.945833 | 0.941667 |
| Viral Pneumonia | 0.675000 | 0.904167 | 0.883333 | 0.800000 |
| Total | 0.781944 | 0.847222 | 0.909722 | 0.873611 |
##################################
# Plotting all the recall
# model performance measures
# for all CNN models
##################################
cnn_model_performance_comparison_recall_plot = cnn_model_performance_comparison_recall_plot.plot.barh(figsize=(10, 6), width=0.90)
cnn_model_performance_comparison_recall_plot.set_xlim(0.00,1.00)
cnn_model_performance_comparison_recall_plot.set_title("Model Comparison by Recall Performance on Validation Data")
cnn_model_performance_comparison_recall_plot.set_xlabel("Recall Performance")
cnn_model_performance_comparison_recall_plot.set_ylabel("Image Categories")
cnn_model_performance_comparison_recall_plot.grid(False)
cnn_model_performance_comparison_recall_plot.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
for container in cnn_model_performance_comparison_recall_plot.containers:
    cnn_model_performance_comparison_recall_plot.bar_label(container, fmt='%.5f', padding=-50, color='white', fontweight='bold')
##################################
# Consolidating all the f-score
# model performance measures
##################################
cnn_model_performance_comparison_fscore = cnn_model_performance_comparison[cnn_model_performance_comparison['Model.Metric']=='F-Score']
cnn_model_performance_comparison_fscore_CNN_NR = cnn_model_performance_comparison_fscore[cnn_model_performance_comparison_fscore['CNN.Model.Name']=='CNN_NR'].loc[:,"Metric.Value"]
cnn_model_performance_comparison_fscore_CNN_DR = cnn_model_performance_comparison_fscore[cnn_model_performance_comparison_fscore['CNN.Model.Name']=='CNN_DR'].loc[:,"Metric.Value"]
cnn_model_performance_comparison_fscore_CNN_BNR = cnn_model_performance_comparison_fscore[cnn_model_performance_comparison_fscore['CNN.Model.Name']=='CNN_BNR'].loc[:,"Metric.Value"]
cnn_model_performance_comparison_fscore_CNN_DR_BNR = cnn_model_performance_comparison_fscore[cnn_model_performance_comparison_fscore['CNN.Model.Name']=='CNN_DR_BNR'].loc[:,"Metric.Value"]
##################################
# Combining all the f-score
# model performance measures
# for all CNN models
##################################
cnn_model_performance_comparison_fscore_plot = pd.DataFrame({'CNN_NR': cnn_model_performance_comparison_fscore_CNN_NR.values,
                                                             'CNN_DR': cnn_model_performance_comparison_fscore_CNN_DR.values,
                                                             'CNN_BNR': cnn_model_performance_comparison_fscore_CNN_BNR.values,
                                                             'CNN_DR_BNR': cnn_model_performance_comparison_fscore_CNN_DR_BNR.values},
                                                            index=cnn_model_performance_comparison_fscore['Image.Category'].unique())
cnn_model_performance_comparison_fscore_plot
| | CNN_NR | CNN_DR | CNN_BNR | CNN_DR_BNR |
|---|---|---|---|---|
| COVID | 0.884696 | 0.917695 | 0.921109 | 0.913420 |
| Normal | 0.726577 | 0.794393 | 0.909820 | 0.852830 |
| Viral Pneumonia | 0.736364 | 0.825095 | 0.898305 | 0.857143 |
| Total | 0.782546 | 0.845728 | 0.909744 | 0.874464 |
##################################
# Plotting all the fscore
# model performance measures
# for all CNN models
##################################
cnn_model_performance_comparison_fscore_plot = cnn_model_performance_comparison_fscore_plot.plot.barh(figsize=(10, 6), width=0.90)
cnn_model_performance_comparison_fscore_plot.set_xlim(0.00,1.00)
cnn_model_performance_comparison_fscore_plot.set_title("Model Comparison by F-Score Performance on Validation Data")
cnn_model_performance_comparison_fscore_plot.set_xlabel("F-Score Performance")
cnn_model_performance_comparison_fscore_plot.set_ylabel("Image Categories")
cnn_model_performance_comparison_fscore_plot.grid(False)
cnn_model_performance_comparison_fscore_plot.legend(loc='center left', bbox_to_anchor=(1.0, 0.5))
for container in cnn_model_performance_comparison_fscore_plot.containers:
    cnn_model_performance_comparison_fscore_plot.bar_label(container, fmt='%.5f', padding=-50, color='white', fontweight='bold')
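With the three comparison tables in hand, the candidate with the best overall balance can be picked programmatically. The sketch below ranks the models by their macro-averaged (`Total`) F-score, using values copied from the F-score table above; using the F-score as the sole ranking criterion is an assumption made for illustration.

```python
# Macro-averaged ("Total") F-scores copied from the comparison table
total_f_score = {
    "CNN_NR": 0.782546,
    "CNN_DR": 0.845728,
    "CNN_BNR": 0.909744,
    "CNN_DR_BNR": 0.874464,
}

# Rank candidates from best to worst
ranking = sorted(total_f_score, key=total_f_score.get, reverse=True)
best_model = ranking[0]

print(ranking)
print("Selected:", best_model)
```

Under this criterion the model with batch normalization alone ranks first. It also has the highest `Total` precision and recall in the corresponding tables.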
##################################
# Visualizing the learned and updated filters
# for the first convolutional layer
# from the selected CNN model defined as
# CNN with batch normalization regularization
##################################
conv2d_0_filters, conv2d_0_biases = model_bnr.layers[0].get_weights()
plt.figure(figsize=(10, 6))
for i in range(conv2d_0_filters.shape[3]):
    plt.subplot(4, 8, i+1)
    plt.imshow(conv2d_0_filters[:, :, 0, i], cmap='Oranges')
    plt.axis('off')
plt.show()
##################################
# Visualizing the learned and updated filters
# for the second convolutional layer
# from the selected CNN model defined as
# CNN with batch normalization regularization
##################################
conv2d_1_filters, conv2d_1_biases = model_bnr.layers[2].get_weights()
plt.figure(figsize=(10, 12))
for i in range(conv2d_1_filters.shape[3]):
    plt.subplot(8, 8, i+1)
    # Index 0 selects only the first of this layer's 32 input channels for each filter
    plt.imshow(conv2d_1_filters[:, :, 0, i], cmap='Oranges')
    plt.axis('off')
plt.show()
##################################
# Gathering the actual and predicted classes
# from the selected CNN model defined as
# CNN with batch normalization regularization
##################################
model_bnr_predictions = np.array(list(map(lambda x: np.argmax(x), model_bnr_y_pred)))
model_bnr_y_true = test_gen.classes
##################################
# Consolidating the actual and predicted classes
# from the selected CNN model defined as
# CNN with batch normalization regularization
##################################
class_indices = test_gen.class_indices
indices = {v:k for k,v in class_indices.items()}
filenames = test_gen.filenames
test_gen_df = pd.DataFrame()
test_gen_df['FileName'] = filenames
test_gen_df['Actual_Category'] = model_bnr_y_true
test_gen_df['Predicted_Category'] = model_bnr_predictions
test_gen_df['Actual_Category'] = test_gen_df['Actual_Category'].apply(lambda x: indices[x])
test_gen_df['Predicted_Category'] = test_gen_df['Predicted_Category'].apply(lambda x: indices[x])
test_gen_df.loc[test_gen_df['Actual_Category']==test_gen_df['Predicted_Category'],'Matched_Category_Prediction'] = True
test_gen_df.loc[test_gen_df['Actual_Category']!=test_gen_df['Predicted_Category'],'Matched_Category_Prediction'] = False
test_gen_df.head(10)
| | FileName | Actual_Category | Predicted_Category | Matched_Category_Prediction |
|---|---|---|---|---|
| 0 | COVID\COVID-1.png | COVID | COVID | True |
| 1 | COVID\COVID-10.png | COVID | COVID | True |
| 2 | COVID\COVID-100.png | COVID | COVID | True |
| 3 | COVID\COVID-1000.png | COVID | Viral Pneumonia | False |
| 4 | COVID\COVID-1001.png | COVID | COVID | True |
| 5 | COVID\COVID-1002.png | COVID | COVID | True |
| 6 | COVID\COVID-1003.png | COVID | COVID | True |
| 7 | COVID\COVID-1004.png | COVID | COVID | True |
| 8 | COVID\COVID-1005.png | COVID | COVID | True |
| 9 | COVID\COVID-1006.png | COVID | COVID | True |
##################################
# Formulating image samples
# from the validation set
##################################
test_gen_df = test_gen_df.sample(frac=1, replace=False, random_state=123).reset_index(drop=True)
##################################
# Defining a function
# to load the sampled images
##################################
img_size=299
def readImage(path):
    img = load_img(path,color_mode="grayscale", target_size=(img_size,img_size))
    img = img_to_array(img)
    img = img/255.
    return img
##################################
# Defining a function
# to display the sampled images
# with the actual and predicted categories
##################################
def display_images(temp_df):
    temp_df = temp_df.reset_index(drop=True)
    plt.figure(figsize = (20 , 20))
    n = 0
    for i in range(15):
        n+=1
        plt.subplot(5 , 5, n)
        plt.subplots_adjust(hspace = 0.5 , wspace = 0.3)
        image = readImage(f"C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/{temp_df.FileName[i]}")
        plt.imshow(image)
        plt.title(f'A: {temp_df.Actual_Category[i]} P: {temp_df.Predicted_Category[i]}')
##################################
# Display sample images with matched
# actual and predicted categories
##################################
display_images(test_gen_df[test_gen_df['Matched_Category_Prediction']==True])
##################################
# Display sample images with mismatched
# actual and predicted categories
##################################
display_images(test_gen_df[test_gen_df['Matched_Category_Prediction']!=True])
##################################
# Defining a function
# to gather the model layer information
# and formulate the gradient class activation map
# from the output of the first convolutional layer
##################################
def make_gradcam_heatmap(img_array, model, pred_index=None):
    grad_model = Model(inputs=model.inputs, outputs=[model.layers[0].output, model.output])
    with tf.GradientTape() as tape:
        last_conv_layer_output, preds = grad_model(img_array)
        if pred_index is None:
            pred_index = tf.argmax(preds[0])
        class_channel = preds[:, pred_index]
    grads = tape.gradient(class_channel, last_conv_layer_output)
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))
    last_conv_layer_output = last_conv_layer_output[0]
    heatmap = last_conv_layer_output @ pooled_grads[..., tf.newaxis]
    heatmap = tf.squeeze(heatmap)
    heatmap = tf.maximum(heatmap, 0) / tf.math.reduce_max(heatmap)
    return heatmap.numpy(), preds
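To make the tensor operations in `make_gradcam_heatmap` easier to follow, the sketch below repeats its final steps in NumPy on a tiny made-up feature map and gradient (both hypothetical, not taken from the model): average the gradients over the spatial positions, use those averages as per-channel weights, sum the weighted channels, then clip at zero and scale by the maximum. The function name `grad_cam_from_arrays` is illustrative, and the guard against an all-zero map is an addition for the sketch that is not present in the function above.

```python
import numpy as np

def grad_cam_from_arrays(feature_map, grads):
    """Combine a (H, W, C) feature map with same-shaped gradients into a heatmap."""
    # Average each channel's gradient over all spatial positions
    pooled_grads = grads.mean(axis=(0, 1))   # shape (C,)
    # Weighted sum of the channels at every spatial position
    heatmap = feature_map @ pooled_grads     # shape (H, W)
    # Keep positive evidence only
    heatmap = np.maximum(heatmap, 0.0)
    peak = heatmap.max()
    # Guard added for this sketch: avoid dividing by zero on an all-zero map
    return heatmap / peak if peak > 0 else heatmap

# Hypothetical 2 x 2 feature map with 2 channels
feature_map = np.array([[[1.0, 0.0], [2.0, 1.0]],
                        [[0.0, 3.0], [4.0, 0.0]]])
# Constant gradients: +0.5 for channel 0 and -1.0 for channel 1
grads = np.tile(np.array([0.5, -1.0]), (2, 2, 1))

heatmap = grad_cam_from_arrays(feature_map, grads)
print(heatmap)
```

Positions where the positively weighted channel dominates end up bright, while positions dominated by the negatively weighted channel are clipped to zero.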
##################################
# Defining a function
# to colorize the generated heatmap
# and superimpose on the actual image
##################################
def gradCAMImage(image):
    path = f"C:/Users/John pauline magno/Python Notebooks/COVID-19_Radiography_Dataset/{image}"
    img = readImage(path)
    img = np.expand_dims(img,axis=0)
    heatmap, preds = make_gradcam_heatmap(img, model_bnr)
    img = load_img(path)
    img = img_to_array(img)
    heatmap = np.uint8(255 * heatmap)
    jet = cm.get_cmap("jet")
    jet_colors = jet(np.arange(256))[:, :3]
    jet_heatmap = jet_colors[heatmap]
    jet_heatmap = tf.keras.preprocessing.image.array_to_img(jet_heatmap)
    jet_heatmap = jet_heatmap.resize((img.shape[1], img.shape[0]))
    jet_heatmap = tf.keras.preprocessing.image.img_to_array(jet_heatmap)
    superimposed_img = jet_heatmap * 0.8 + img
    superimposed_img = tf.keras.preprocessing.image.array_to_img(superimposed_img)
    return superimposed_img
##################################
# Defining a function to consolidate
# the gradient class activation maps
# for a subset of sampled images
##################################
def gradcam_of_images(correct_class):
    grad_images = []
    title = []
    temp_df = test_gen_df[test_gen_df['Matched_Category_Prediction']==correct_class]
    temp_df = temp_df.reset_index(drop=True)
    for i in range(15):
        image = temp_df.FileName[i]
        grad_image = gradCAMImage(image)
        grad_images.append(grad_image)
        title.append(f"A: {temp_df.Actual_Category[i]} P: {temp_df.Predicted_Category[i]}")
    return grad_images, title
##################################
# Consolidating the gradient class activation maps
# from the output of the first convolutional layer
# for the subset of sampled images
# with matched actual and predicted categories
##################################
matched_categories, matched_categories_titles = gradcam_of_images(correct_class=True)
##################################
# Consolidating the gradient class activation maps
# from the output of the first convolutional layer
# for the subset of sampled images
# with mismatched actual and predicted categories
##################################
mismatched_categories, mismatched_categories_titles = gradcam_of_images(correct_class=False)
##################################
# Defining a function to display
# the consolidated gradient class activation maps
# for a subset of sampled images
##################################
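def display_heatmaps(classified_images, titles):
    plt.figure(figsize = (20 , 20))
    # Plot up to 15 images; the list may contain fewer entries
    for i in range(min(15, len(classified_images))):
        plt.subplot(5 , 5, i + 1)
        plt.imshow(classified_images[i])
        plt.title(titles[i])
    # Spacing is set once for the whole figure
    plt.subplots_adjust(hspace = 0.5 , wspace = 0.3)
    plt.show()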
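The display function above places 15 images in a fixed 5 x 5 grid, which leaves the last two rows empty. As a small sketch of one alternative, the grid shape could be derived from the number of images; the helper name `grid_shape` and its default of five columns are assumptions for illustration and are not part of the notebook.

```python
import math

def grid_shape(n_images, n_cols=5):
    """Return (rows, cols) of the smallest grid with n_cols columns that holds n_images."""
    # At least one row is kept so that an empty list still yields a valid grid.
    n_rows = max(1, math.ceil(n_images / n_cols))
    return n_rows, n_cols

print(grid_shape(15))  # (3, 5)
print(grid_shape(7))   # (2, 5)
```

The returned pair could then be passed to `plt.subplot(n_rows, n_cols, index)` in place of the hard-coded values.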
##################################
# Displaying the consolidated
# gradient class activation maps
# from the output of the first convolutional layer
# for the subset of sampled images
# with matched actual and predicted categories
##################################
display_heatmaps(matched_categories, matched_categories_titles)
##################################
# Displaying the consolidated
# gradient class activation maps
# from the output of the first convolutional layer
# for the subset of sampled images
# with mismatched actual and predicted categories
##################################
display_heatmaps(mismatched_categories, mismatched_categories_titles)
##################################
# Defining a function
# to gather the model layer information
# and formulate the gradient class activation map
# from the output of the second convolutional layer
##################################
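def make_gradcam_heatmap(img_array, model, pred_index=None):
    # Model mapping the input to the chosen layer's activations and the predictions
    grad_model = Model(inputs=model.inputs, outputs=[model.layers[2].output, model.output])
    # Record the forward pass to differentiate the class score
    with tf.GradientTape() as tape:
        last_conv_layer_output, preds = grad_model(img_array)
        if pred_index is None:
            pred_index = tf.argmax(preds[0])
        class_channel = preds[:, pred_index]
    # Gradient of the class score with respect to the layer activations
    grads = tape.gradient(class_channel, last_conv_layer_output)
    # Average the gradients per channel to obtain channel weights
    pooled_grads = tf.reduce_mean(grads, axis=(0, 1, 2))
    last_conv_layer_output = last_conv_layer_output[0]
    # Weighted sum of the activation channels
    heatmap = last_conv_layer_output @ pooled_grads[..., tf.newaxis]
    heatmap = tf.squeeze(heatmap)
    # Keep positive contributions and scale by the maximum value
    heatmap = tf.maximum(heatmap, 0) / tf.math.reduce_max(heatmap)
    return heatmap.numpy(), preds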
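The arithmetic inside the function above can be traced on a toy example. The following is a minimal NumPy sketch under stated assumptions: the helper name `gradcam_from_arrays` and the input arrays are made up for illustration, the gradients are supplied directly instead of being computed with `tf.GradientTape`, and the guard against an all-zero heatmap is an addition not present in the notebook.

```python
import numpy as np

def gradcam_from_arrays(feature_maps, grads):
    """Combine feature maps (H, W, C) and same-shaped gradients into a heatmap (H, W)."""
    # Channel weights: average each channel's gradient over the spatial grid.
    weights = grads.mean(axis=(0, 1))      # shape (C,)
    # Weighted sum of the feature maps across channels.
    heatmap = feature_maps @ weights       # shape (H, W)
    # Keep only positive contributions, then scale to [0, 1].
    heatmap = np.maximum(heatmap, 0.0)
    peak = heatmap.max()
    return heatmap / peak if peak > 0 else heatmap

# Toy inputs made up for illustration: a 2 x 2 map with 2 channels.
feature_maps = np.array([[[1.0, 0.0], [2.0, 1.0]],
                         [[0.0, 3.0], [4.0, 2.0]]])
grads = np.array([[[1.0, -1.0], [1.0, -1.0]],
                  [[1.0, -1.0], [1.0, -1.0]]])
toy_heatmap = gradcam_from_arrays(feature_maps, grads)
print(toy_heatmap)
```

Here the channel weights are 1 and -1, so the heatmap is the first channel minus the second, with the negative entry zeroed and the result divided by the maximum value of 2.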
##################################
# Consolidating the gradient class activation maps
# from the output of the second convolutional layer
# for the subset of sampled images
# with matched actual and predicted categories
##################################
matched_categories, matched_categories_titles = gradcam_of_images(correct_class=True)
##################################
# Consolidating the gradient class activation maps
# from the output of the second convolutional layer
# for the subset of sampled images
# with mismatched actual and predicted categories
##################################
mismatched_categories, mismatched_categories_titles = gradcam_of_images(correct_class=False)
##################################
# Displaying the consolidated
# gradient class activation maps
# from the output of the second convolutional layer
# for the subset of sampled images
# with matched actual and predicted categories
##################################
display_heatmaps(matched_categories, matched_categories_titles)
##################################
# Displaying the consolidated
# gradient class activation maps
# from the output of the second convolutional layer
# for the subset of sampled images
# with mismatched actual and predicted categories
##################################
display_heatmaps(mismatched_categories, mismatched_categories_titles)